ZINC FINGER BINDING DOMAINS FOR CNN

Funds used to support some of the studies reported herein were provided by the
National Institutes of Health (NIH GM 53910). The United States Government, therefore,
may have certain rights in the invention.

Cross Reference to Related Applications 

This application is a continuation-in-part of United States Provisional Patent 
Applications Serial Numbers 60/313,864 and 60/313,693, filed August 20, 2001, the 
disclosures of which are incorporated herein by reference. 

Technical Field of the Invention 

The field of this invention is zinc finger protein binding to target nucleotides. More
particularly, the present invention pertains to amino acid residue sequences within the
α-helical domain of zinc fingers that specifically bind to target nucleotides of the formula
5'-(CNN)-3'.

Background of the Invention 

The construction of artificial transcription factors has been of great interest in the past
years. Gene expression can be specifically regulated by polydactyl zinc finger proteins fused
to regulatory domains. Zinc finger domains of the Cys2-His2 family have been most
promising for the construction of artificial transcription factors due to their modular structure.
Each domain consists of approximately 30 amino acids and folds into a ββα structure stabilized
by hydrophobic interactions and chelation of a zinc ion by the conserved Cys2-His2 residues.
To date, the best characterized protein of this family of zinc finger proteins is the mouse
transcription factor Zif268 [Pavletich et al., (1991) Science 252(5007), 809-817;
Elrod-Erickson et al., (1996) Structure 4(10), 1171-1180]. The analysis of the Zif268/DNA
complex suggested that DNA binding is predominantly achieved by the interaction of amino
acid residues of the α-helix in positions -1, 3, and 6 with the 3', middle, and 5' nucleotide of a
3 bp DNA subsite, respectively. Positions 1, 2 and 5 have been shown to make direct or
water-mediated contacts with the phosphate backbone of the DNA. Leucine is usually found
in position 4 and packs into the hydrophobic core of the domain. Position 2 of the α-helix has
been shown to interact with other helix residues and, in addition, can make contact with a
nucleotide outside the 3 bp subsite [Pavletich et al., (1991) Science 252(5007), 809-817;
Elrod-Erickson et al., (1996) Structure 4(10), 1171-1180; Isalan, M. et al., (1997) Proc Natl
Acad Sci USA 94(11), 5617-5621].
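
The anti-parallel readout described above (helix positions -1, 3 and 6 contacting the 3', middle and 5' nucleotide of a subsite) can be illustrated with a short Python sketch. The function name `contacts` is hypothetical and chosen only for illustration; the sketch encodes the canonical contact pattern and ignores the backbone contacts and the cross-subsite contact of position 2.

```python
def contacts(triplet: str) -> dict[int, str]:
    """Map each recognition position of the alpha-helix to the base it reads.

    `triplet` is a 3 bp subsite written 5'->3', e.g. "CAG".
    Canonical pattern only: position -1 -> 3' base, 3 -> middle, 6 -> 5' base.
    """
    if len(triplet) != 3:
        raise ValueError("a subsite is exactly 3 bp")
    five_prime, middle, three_prime = triplet.upper()
    return {-1: three_prime, 3: middle, 6: five_prime}


print(contacts("CAG"))  # {-1: 'G', 3: 'A', 6: 'C'}
```

For a 5'-CNN-3' subsite, the sketch makes explicit that helix position 6 is the one that must specify the 5' cytosine.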

The selection of modular zinc finger domains recognizing each of the 5'-GNN-3'
DNA subsites with high specificity and affinity and their refinement by site-directed
mutagenesis has been demonstrated (United States Patent No. 6,140,081, the disclosure of
which is incorporated herein by reference). These modular domains can be assembled into
zinc finger proteins recognizing extended 18 bp DNA sequences which are unique within the
human or any other genome. In addition, these proteins function as transcription factors and
are capable of altering gene expression when fused to regulatory domains and can even be
made hormone-dependent by fusion to ligand-binding domains of nuclear hormone receptors.
To allow the rapid construction of zinc finger-based transcription factors binding to any DNA
sequence it is important to extend the existing set of modular zinc finger domains to
recognize each of the 64 possible DNA triplets. This aim can be achieved by phage display
selection and/or rational design. Due to the limited structural data on zinc finger/DNA
interaction, rational design of zinc finger proteins is very time-consuming and may not be
possible in many instances. In addition, most naturally occurring zinc finger proteins consist of
domains recognizing the 5'-(GNN)-3' type of DNA sequences. The most promising
approach to identify novel zinc finger domains binding to DNA target sequences of the type
5'-NNN-3' is selection via phage display. The limiting step for this approach is the
construction of libraries that allow the specification of a 5' adenine, cytosine or thymine.
Phage display selections have been based on Zif268 in which different fingers of this protein
were randomized [Choo et al., (1994) Proc Natl Acad Sci USA 91(23), 11168-72; Rebar
et al., (1994) Science (Washington, D.C., 1883-) 263(5147), 671-3; Jamieson et al., (1994)
Biochemistry 33, 5689-5695; Wu et al., (1995) PNAS 92, 344-348; Jamieson et al., (1996)
Proc Natl Acad Sci USA 93, 12834-12839; Greisman et al., (1997) Science 275(5300),
657-661]. A set of 16 domains recognizing the 5'-GNN-3' type of DNA sequences has previously
been reported from a library where finger 2 of C7, a derivative of Zif268 [Wu et al., (1995)
PNAS 92, 344-348], was randomized [Segal et al., (1999) Proc Natl Acad Sci USA
96(6), 2758-2763]. In such a strategy, selection is limited to domains recognizing 5'-GNN-3'
or 5'-TNN-3' due to the Asp2 of finger 3 making contact with the complementary base of a
5' guanine or thymine in the finger-2 subsite [Pavletich et al., (1991) Science 252(5007),
809-817; Elrod-Erickson et al., (1996) Structure 4(10), 1171-1180].
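
The figures quoted above (64 possible triplets, and 18 bp sites that are unique within a genome) follow from simple counting, sketched below in Python. The genome size of roughly 3e9 bp is an assumption introduced here for illustration and is not a value taken from this disclosure; the estimate also treats the genome as a random sequence and counts a single strand only.

```python
BASES = "ACGT"

n_triplets = len(BASES) ** 3    # number of possible 3 bp subsites
n_18mers = len(BASES) ** 18     # number of distinct 18 bp sequences

# Assumption (not from the text): haploid human genome of about 3e9 bp.
GENOME_BP = 3_000_000_000

# Expected chance occurrences of one specific 18-mer in a random
# sequence of that length, single strand only.
expected_hits = GENOME_BP / n_18mers

print(n_triplets)               # 64
print(n_18mers)                 # 68719476736
print(round(expected_hits, 3))  # 0.044
```

Under these assumptions a given 18 bp site is expected to occur by chance far less than once per genome, which is consistent with six-finger proteins being used to address unique sites.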

The present approach is based on the modularity of zinc finger domains that allows
the rapid construction of zinc finger proteins by the scientific community and demonstrates
that the limitations imposed by cross-subsite interactions occur only in a
limited number of cases. The present disclosure introduces a new strategy for selection of
zinc finger domains specifically recognizing the 5'-CNN-3' type of DNA sequences. Specific
DNA-binding properties of these domains were evaluated by a multi-target ELISA against all
sixteen 5'-CNN-3' triplets. These domains can be readily incorporated into polydactyl
proteins containing various numbers of 5'-CNN-3' domains, each specifically recognizing
extended 18 bp sequences. Furthermore, these domains can specifically alter gene expression
when fused to regulatory domains. These results underline the feasibility of constructing
polydactyl proteins from pre-defined building blocks. In addition, the domains characterized
here greatly increase the number of DNA sequences that can be targeted with artificial
transcription factors.

Brief Summary of the Invention

In one aspect, the present invention provides an isolated and purified zinc finger
nucleotide binding polypeptide that contains a nucleotide binding region of from 5 to 10
amino acid residues, which region binds preferentially to a target nucleotide of the formula
CNN, where N is A, C, G or T. Preferably, the target nucleotide has the formula CAA, CAC,
CAG, CAT, CCA, CCC, CCG, CCT, CGA, CGC, CGG, CGT, CTA, CTC, CTG or CTT.
In one embodiment, a polypeptide of the invention contains a binding region that has an
amino acid residue sequence with the same nucleotide binding characteristics as any of SEQ
ID NOs:1-25. Such a polypeptide competes for binding to a nucleotide target with any of
SEQ ID NOs:1-25. Preferably, the binding region has the amino acid residue sequence of any
of SEQ ID NOs:1-25. In one embodiment, this invention provides an isolated and purified
zinc finger nucleotide binding polypeptide consisting of an amino acid residue sequence of
any of SEQ ID NOs:1-25.
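
As a quick check on the list of target triplets given above, the sixteen 5'-CNN-3' sequences can be enumerated programmatically. The following Python sketch is illustrative only; the function name `cnn_triplets` is hypothetical.

```python
from itertools import product

BASES = "ACGT"


def cnn_triplets() -> list[str]:
    """Enumerate every triplet of the formula CNN, where N is A, C, G or T."""
    return ["C" + "".join(pair) for pair in product(BASES, repeat=2)]


triplets = cnn_triplets()
print(len(triplets))  # 16
print(triplets)
```

The enumeration yields CAA through CTT in the same order as the list recited in the text.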

In another aspect, the present invention provides a peptide composition that contains a
plurality of, and preferably from about 2 to about 12, zinc finger nucleotide binding
polypeptides as disclosed herein. The polypeptides are operatively linked, such as via a
flexible peptide linker of from 5 to 15 amino acid residues. Operative linkage preferably
occurs via a flexible peptide linker such as that shown in SEQ ID NO:30. Such a
composition binds to a nucleotide sequence that contains a sequence of the formula
5'-(CNN)n-3', where N is A, C, G or T and n is 2 to 12. Preferably, the composition contains
from about 2 to about 6 zinc finger nucleotide binding polypeptides and binds to a nucleotide
sequence that contains a sequence of the formula 5'-(CNN)n-3', where n is 2 to 6. Binding
occurs with a KD of from 1 fM to 10 µM. Preferably binding occurs with a KD of from 10 fM
to 1 µM, from 10 pM to 100 nM, from 100 pM to 10 nM and, more preferably with a KD of
from 1 nM to 10 nM. In preferred embodiments, both a polypeptide and a composition of
this invention are operatively linked to one or more transcription regulating factors such as a
repressor of transcription or an activator of transcription.
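
To give a feel for what the dissociation constants recited above imply, the following Python sketch evaluates the standard 1:1 binding isotherm, fraction bound = [P] / ([P] + KD). This is a textbook relation offered as an illustration, not a model taken from this disclosure; it assumes simple one-site binding with protein in large excess over DNA, and the function name `fraction_bound` and the 10 nM protein concentration are hypothetical.

```python
def fraction_bound(protein_conc_molar: float, kd_molar: float) -> float:
    """Fraction of target sites occupied for simple 1:1 binding.

    Assumes free protein is approximately equal to total protein
    (protein in large excess over DNA sites).
    """
    if protein_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return protein_conc_molar / (protein_conc_molar + kd_molar)


# Hypothetical protein concentration of 10 nM, at the two ends of the
# more preferred KD range (1 nM to 10 nM).
for kd in (1e-9, 10e-9):
    print(kd, fraction_bound(10e-9, kd))
```

When the protein concentration equals KD, half of the sites are occupied; a ten-fold lower KD raises occupancy to about 91% at the same concentration.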

The present invention further provides polynucleotides that encode a polypeptide or a 
composition of this invention, expression vectors that contain such polynucleotides and host 
cells transformed with the polynucleotide or expression vector. 

The present invention further provides a process of regulating expression of a
nucleotide sequence that contains the target nucleotide sequence 5'-(CNN)-3'. The target
nucleotide sequence can be located anywhere within a longer 5'-(NNN)-3' sequence. The
process includes the step of exposing the nucleotide sequence to an effective amount of a zinc
finger nucleotide binding polypeptide or composition as set forth herein. In one embodiment,
a process regulates expression of a nucleotide sequence that contains the sequence
5'-(CNN)n-3', where n is 2 to 12. The process includes the step of exposing the nucleotide
sequence to an effective amount of a composition of this invention. The sequence
5'-(CNN)n-3' can be located in the transcribed region of the nucleotide sequence, in a
promoter region of the nucleotide sequence, or within an expressed sequence tag. The
composition is preferably operatively linked to one or more transcription regulating factors
such as a repressor of transcription or an activator of transcription. In one embodiment, the
nucleotide sequence is a gene such as a eukaryotic gene, a prokaryotic gene or a viral gene.
The eukaryotic gene can be a mammalian gene such as a human gene or a plant gene. The
prokaryotic gene can be a bacterial gene.
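
Locating candidate 5'-(CNN)n-3' target sites within a longer sequence, as described above, can be sketched with a regular expression. The Python example below is a minimal illustration under stated assumptions: the function name `find_cnn_sites` is hypothetical, only the strand passed in is scanned (the reverse complement is not considered), and no account is taken of chromatin accessibility or binding affinity.

```python
import re


def find_cnn_sites(seq: str, n: int) -> list[tuple[int, str]]:
    """Return (0-based start, site) for every 5'-(CNN)n-3' match in `seq`.

    Overlapping matches are reported; only the given strand is scanned.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    seq = seq.upper()
    # A zero-width lookahead lets finditer report overlapping sites.
    pattern = re.compile(r"(?=((?:C[ACGT]{2}){%d}))" % n)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


print(find_cnn_sites("GGCATCGGTT", 2))  # [(2, 'CATCGG')]
```

With n = 6 the same function returns candidate 18 bp sites of the kind addressed by six-finger compositions.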



Brief Description of the Drawings 

In the drawings that form a portion of the specification, FIG. 1 shows, in two panels
designated 1A and 1B, schematically, construction of the zinc finger phage display library (A)
and multitarget specificity ELISA for the C7 proteins (B).

WO 03/016496 PCT/US02/26388

Detailed Description of the Invention 
Definitions

Unless defined otherwise, all technical and scientific terms used herein have the same
meaning as is commonly understood by one of skill in the art to which this invention belongs.

As used herein, the transcription regulating domain or factor refers to the portion of
the fusion polypeptide provided herein that functions to regulate gene transcription.
Exemplary and preferred transcription repressor domains are ERD, KRAB, SID, Deacetylase,
and derivatives, multimers and combinations thereof such as KRAB-ERD, SID-ERD,
(KRAB)2, (KRAB)3, KRAB-A, (KRAB-A)2, (SID)2, (KRAB-A)-SID and SID-(KRAB-A).

As used herein, nucleotide binding domain or region refers to the portion of a polypeptide or
composition provided herein that provides specific nucleic acid binding capability. The
nucleotide binding region functions to target a subject polypeptide to specific genes.

As used herein, operatively linked means that elements of a polypeptide, for example, are
linked such that each performs or functions as intended. For example, a repressor is attached
to the binding domain in such a manner that, when bound to a target nucleotide via that
binding domain, the repressor acts to inhibit or prevent transcription. Linkage between and
among elements may be direct or indirect, such as via a linker. The elements are not
necessarily adjacent. Hence a repressor domain can be linked to a nucleotide binding domain
using any linking procedure well known in the art. It may be necessary to include a linker
moiety between the two domains. Such a linker moiety is typically a short sequence of amino
acid residues that provides spacing between the domains. So long as the linker does not
interfere with any of the functions of the binding or repressor domains, any sequence can be
used.

As used herein, "modulating" envisions the inhibition or suppression of expression
from a promoter containing a zinc finger-nucleotide binding motif when it is over-activated,
or augmentation or enhancement of expression from such a promoter when it is
underactivated.

As used herein, the amino acids, which occur in the various amino acid sequences
appearing herein, are identified according to their well-known, three-letter or one-letter
abbreviations. The nucleotides, which occur in the various DNA fragments, are designated
with the standard single-letter designations used routinely in the art.

In a peptide or protein, suitable conservative substitutions of amino acids are known
to those of skill in this art and may be made generally without altering the biological activity
of the resulting molecule. Those of skill in this art recognize that, in general, single amino
acid substitutions in non-essential regions of a polypeptide do not substantially alter
biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987,
The Benjamin/Cummings Pub. Co., p. 224).

As used herein, "expression vector" refers to a plasmid, virus or other vehicle known 
in the art that has been manipulated by insertion or incorporation of heterologous DNA, such 
as nucleic acid encoding the fusion proteins herein or expression cassettes provided herein. 
Such expression vectors contain a promoter sequence for efficient transcription of the 
inserted nucleic acid in a cell. The expression vector typically contains an origin of 
replication, a promoter, as well as specific genes that permit phenotypic selection of 
transformed cells. 

As used herein, "host cells" are cells in which a vector can be propagated and its DNA
expressed. The term also includes any progeny of the subject host cell. It is understood that 
all progeny may not be identical to the parental cell since there may be mutations that occur 
during replication. Such progeny are included when the term "host cell" is used. Methods of 
stable transfer where the foreign DNA is continuously maintained in the host are known in 
the art. 

As used herein, genetic therapy involves the transfer of heterologous DNA to
certain cells, target cells, of a mammal, particularly a human, with a disorder or condition for
which such therapy is sought. The DNA is introduced into the selected target cells in a
manner such that the heterologous DNA is expressed and a therapeutic product encoded
thereby is produced. Alternatively, the heterologous DNA may in some manner mediate
expression of DNA that encodes the therapeutic product, or it may encode a product, such as
a peptide or RNA, that in some manner mediates, directly or indirectly, expression of a
therapeutic product. Genetic therapy may also be used to deliver nucleic acid encoding a
gene product that replaces a defective gene or supplements a gene product produced by the
mammal or the cell in which it is introduced. The introduced nucleic acid may encode a
therapeutic compound, such as a growth factor or inhibitor thereof, or a tumor necrosis factor or
inhibitor thereof, such as a receptor therefor, that is not normally produced in the mammalian
host or that is not produced in therapeutically effective amounts or at a therapeutically useful
time. The heterologous DNA encoding the therapeutic product may be modified prior to
introduction into the cells of the afflicted host in order to enhance or otherwise alter the
product or expression thereof. Genetic therapy may also involve delivery of an inhibitor or
repressor or other modulator of gene expression.

As used herein, heterologous DNA is DNA that encodes RNA and proteins that are
not normally produced in vivo by the cell in which it is expressed or that mediates or encodes
mediators that alter expression of endogenous DNA by affecting transcription, translation, or
other regulatable biochemical processes. Heterologous DNA may also be referred to as
foreign DNA. Any DNA that one of skill in the art would recognize or consider as
heterologous or foreign to the cell in which it is expressed is herein encompassed by
heterologous DNA. Examples of heterologous DNA include, but are not limited to, DNA
that encodes traceable marker proteins, such as a protein that confers drug resistance, DNA
that encodes therapeutically effective substances, such as anti-cancer agents, enzymes and
hormones, and DNA that encodes other types of proteins, such as antibodies. Antibodies that
are encoded by heterologous DNA may be secreted or expressed on the surface of the cell in
which the heterologous DNA has been introduced.

Hence, herein, heterologous DNA or foreign DNA includes a DNA molecule not
present in the exact orientation and position as the counterpart DNA molecule found in the
genome. It may also refer to a DNA molecule from another organism or species (i.e.,
exogenous).

As used herein, a therapeutically effective product is a product that is encoded by
heterologous nucleic acid, typically DNA, and that, upon introduction of the nucleic acid into a
host, is expressed and ameliorates or eliminates the symptoms or manifestations of an
inherited or acquired disease, or cures the disease. Typically, DNA encoding a desired
gene product is cloned into a plasmid vector and introduced by routine methods, such as
calcium-phosphate mediated DNA uptake (see (1981) Somat. Cell Mol. Genet. 7:603-616)
or microinjection, into producer cells, such as packaging cells. After amplification in
producer cells, the vectors that contain the heterologous DNA are introduced into selected
target cells.

As used herein, an expression or delivery vector refers to any plasmid or virus into
which a foreign or heterologous DNA may be inserted for expression in a suitable host cell,
i.e., the protein or polypeptide encoded by the DNA is synthesized in the host cell's system.
Vectors capable of directing the expression of DNA segments (genes) encoding one or more
proteins are referred to herein as "expression vectors". Also included are vectors that allow
cloning of cDNA (complementary DNA) from mRNAs produced using reverse transcriptase.

As used herein, a gene refers to a nucleic acid molecule whose nucleotide sequence encodes
an RNA or polypeptide. A gene can be either RNA or DNA. Genes may include regions
preceding and following the coding region (leader and trailer) as well as intervening
sequences (introns) between individual coding segments (exons).

As used herein, isolated with reference to a nucleic acid molecule or polypeptide or
other biomolecule means that the nucleic acid or polypeptide has been separated from the
genetic environment from which the polypeptide or nucleic acid was obtained. It may also mean
altered from the natural state. For example, a polynucleotide or a polypeptide naturally
present in a living animal is not "isolated", but the same polynucleotide or polypeptide
separated from the coexisting materials of its natural state is "isolated", as the term is
employed herein. Thus, a polypeptide or polynucleotide produced and/or contained within a
recombinant host cell is considered isolated. Also intended as an "isolated polypeptide" or an
"isolated polynucleotide" are polypeptides or polynucleotides that have been purified,
partially or substantially, from a recombinant host cell or from a native source. For example,
a recombinantly produced version of a compound can be substantially purified by the
one-step method described in Smith et al. (1988) Gene 67:31-40. The terms isolated and
purified are sometimes used interchangeably.

Thus, by "isolated" is meant that the nucleic acid is free of the coding sequences of those
genes that, in a naturally-occurring genome, immediately flank the gene encoding the nucleic
acid of interest. Isolated DNA may be single-stranded or double-stranded, and may be genomic
DNA, cDNA, recombinant hybrid DNA, or synthetic DNA. It may be identical to a native
DNA sequence, or may differ from such sequence by the deletion, addition, or substitution of
one or more nucleotides.

Isolated or purified as it refers to preparations made from biological cells or hosts
means any cell extract containing the indicated DNA or protein including a crude extract of
the DNA or protein of interest. For example, in the case of a protein, a purified preparation
can be obtained following an individual technique or a series of preparative or biochemical
techniques and the DNA or protein of interest can be present at various degrees of purity in
these preparations. The procedures may include for example, but are not limited to,
ammonium sulfate fractionation, gel filtration, ion exchange chromatography, affinity
chromatography, density gradient centrifugation and electrophoresis.

A preparation of DNA or protein that is "substantially pure" or "isolated" should be
understood to mean a preparation free from naturally occurring materials with which such
DNA or protein is normally associated in nature. "Essentially pure" should be understood to
mean a "highly" purified preparation that contains at least 95% of the DNA or protein of
interest.

A cell extract that contains the DNA or protein of interest should be understood to 
mean a homogenate preparation or cell-free preparation obtained from cells that express the 
protein or contain the DNA of interest. The term "cell extract" is intended to include culture 
media, especially spent culture media from which the cells have been removed. 

As used herein, "modulate" refers to the suppression, enhancement or induction of a
function. For example, zinc finger-nucleic acid binding domains and variants thereof may
modulate a promoter sequence by binding to a motif within the promoter, thereby enhancing
or suppressing transcription of a gene operatively linked to the promoter cellular nucleotide
sequence. Alternatively, modulation may include inhibition of transcription of a gene where
the zinc finger-nucleotide binding polypeptide variant binds to the structural gene and blocks
DNA dependent RNA polymerase from reading through the gene, thus inhibiting
transcription of the gene. The structural gene may be a normal cellular gene or an oncogene,
for example. Alternatively, modulation may include inhibition of translation of a transcript.

As used herein, "inhibit" refers to the suppression of the level of activation of
transcription of a structural gene operably linked to a promoter. For example, for the methods
herein the gene includes a zinc finger-nucleotide binding motif.

As used herein, a transcriptional regulatory region refers to a region that drives gene
expression in the target cell. Transcriptional regulatory regions suitable for use herein include
but are not limited to the human cytomegalovirus (CMV) immediate-early
enhancer/promoter, the SV40 early enhancer/promoter, the JC polyomavirus promoter, the
albumin promoter, PGK and the α-actin promoter coupled to the CMV enhancer.

As used herein, a promoter region of a gene includes the regulatory elements that
typically lie 5' to a structural gene. If a gene is to be activated, proteins known as
transcription factors attach to the promoter region of the gene. This assembly resembles an
"on switch" by enabling an enzyme to transcribe a second genetic segment from DNA into
RNA. In most cases the resulting RNA molecule serves as a template for synthesis of a
specific protein; sometimes RNA itself is the final product. The promoter region may be a
normal cellular promoter or, for example, an onco-promoter. An onco-promoter is generally
a virus-derived promoter. Viral promoters to which zinc finger binding polypeptides may be
targeted include, but are not limited to, retroviral long terminal repeats (LTRs), and Lentivirus
promoters, such as promoters from human T-cell lymphotropic virus (HTLV) 1 and 2 and
human immunodeficiency virus (HIV) 1 or 2.

As used herein, "effective amount" includes that amount that results in the
deactivation of a previously activated promoter or that amount that results in the inactivation
of a promoter containing a zinc finger-nucleotide binding motif, or that amount that blocks
transcription of a structural gene or translation of RNA. The amount of zinc finger
derived-nucleotide binding polypeptide required is that amount necessary to either displace a
native zinc finger-nucleotide binding protein in an existing protein/promoter complex, or that
amount necessary to compete with the native zinc finger-nucleotide binding protein to form a
complex with the promoter itself. Similarly, the amount required to block a structural gene or
RNA is that amount which binds to and blocks RNA polymerase from reading through on the
gene or that amount which inhibits translation, respectively. Preferably, the method is
performed intracellularly. By functionally inactivating a promoter or structural gene,
transcription or translation is suppressed. Delivery of an effective amount of the inhibitory
protein for binding to or "contacting" the cellular nucleotide sequence containing the zinc
finger-nucleotide binding protein motif can be accomplished by one of the mechanisms
described herein, such as by retroviral vectors or liposomes, or other methods well known in
the art.

As used herein, "truncated" refers to a zinc finger-nucleotide binding polypeptide
derivative that contains less than the full number of zinc fingers found in the native zinc
finger binding protein or that has been deleted of non-desired sequences. For example,
truncation of the zinc finger-nucleotide binding protein TFIIIA, which naturally contains nine
zinc fingers, might be a polypeptide with only zinc fingers one through three. Expansion
refers to a zinc finger polypeptide to which additional zinc finger modules have been added.
For example, TFIIIA may be extended to 12 fingers by adding 3 zinc finger domains. In
addition, a truncated zinc finger-nucleotide binding polypeptide may include zinc finger
modules from more than one wild type polypeptide, thus resulting in a "hybrid" zinc
finger-nucleotide binding polypeptide.

As used herein, "mutagenized" refers to a zinc finger derived-nucleotide binding
polypeptide that has been obtained by performing any of the known methods for
accomplishing random or site-directed mutagenesis of the DNA encoding the protein. For
instance, in TFIIIA, mutagenesis can be performed to replace nonconserved residues in one or
more of the repeats of the consensus sequence. Truncated zinc finger-nucleotide binding
proteins can also be mutagenized.

As used herein, a polypeptide "variant" or "derivative" refers to a polypeptide that is a 
mutagenized form of a polypeptide or one produced through recombination but that still 
retains a desired activity, such as the ability to bind to a ligand or a nucleic acid molecule or 
to modulate transcription. 

As used herein, a zinc finger-nucleotide binding polypeptide "variant" or "derivative"
refers to a polypeptide that is a mutagenized form of a zinc finger protein or one produced
through recombination. A variant may be a hybrid that contains zinc finger domain(s) from
one protein linked to zinc finger domain(s) of a second protein, for example. The domains
may be wild type or mutagenized. A "variant" or "derivative" includes a truncated form of a
wild type zinc finger protein, which contains less than the original number of fingers in the
wild type protein. Examples of zinc finger-nucleotide binding polypeptides from which a
derivative or variant may be produced include TFIIIA and Zif268. Similar terms are used to
refer to "variant" or "derivative" nuclear hormone receptors and "variant" or "derivative"
transcription effector domains.

As used herein, a "zinc finger-nucleotide binding target or motif" refers to any two or
three-dimensional feature of a nucleotide segment to which a zinc finger-nucleotide binding
derivative polypeptide binds with specificity. Included within this definition are nucleotide
sequences, generally of five nucleotides or less, as well as the three dimensional aspects of
the DNA double helix, such as, but not limited to, the major and minor grooves and the
face of the helix. The motif is typically any sequence of suitable length to which the zinc
finger polypeptide can bind. For example, a three finger polypeptide binds to a motif
typically having about 9 to about 14 base pairs. Preferably, the recognition sequence is at
least about 16 base pairs to ensure specificity within the genome. Therefore, zinc
finger-nucleotide binding polypeptides of any specificity are provided. The zinc finger binding
motif can be any sequence designed empirically or to which the zinc finger protein binds.
The motif may be found in any DNA or RNA sequence, including regulatory sequences,
exons, introns, or any non-coding sequence.

As used herein, the terms "pharmaceutically acceptable", "physiologically tolerable"
and grammatical variations thereof, as they refer to compositions, carriers, diluents and
reagents, are used interchangeably and represent that the materials are capable of
administration to or upon a human without the production of undesirable physiological effects
such as nausea, dizziness, gastric upset and the like which would be to a degree that would
prohibit administration of the composition.

As used herein, the term "vector" refers to a nucleic acid molecule capable of
transporting between different genetic environments another nucleic acid to which it has been
operatively linked. Preferred vectors are those capable of autonomous replication and
expression of structural gene products present in the DNA segments to which they are
operatively linked. Vectors, therefore, preferably contain the replicons and selectable
markers described earlier.

As used herein with regard to nucleic acid molecules, including DNA fragments, the
phrase "operatively linked" means the sequences or segments have been covalently joined,
preferably by conventional phosphodiester bonds, into one strand of DNA, whether in single
or double-stranded form, such that the operatively linked portions function as intended. The
choice of vector to which a transcription unit or a cassette provided herein is operatively linked
depends directly, as is well known in the art, on the functional properties desired, e.g., vector
replication and protein expression, and the host cell to be transformed, these being limitations
inherent in the art of constructing recombinant DNA molecules.

As used herein, administration of a therapeutic composition can be effected by any
means, and includes, but is not limited to, subcutaneous, intravenous, intramuscular,
intrasternal, infusion techniques, intraperitoneal administration and parenteral
administration.



I. The Invention 

The present invention provides zinc finger-nucleotide binding polypeptides, 
compositions containing one or more such polypeptides, polynucleotides that encode such 
polypeptides and compositions, expression vectors containing such polynucleotides, cells 
transformed with such polynucleotides or expression vectors and the use of the polypeptides, 
compositions, polynucleotides and expression vectors for modulating nucleotide structure 
and/or function. 

II. Polypeptides 

The present invention provides an isolated and purified zinc finger nucleotide binding 
polypeptide. The polypeptide contains a nucleotide binding region of from 5 to 10 amino 
acid residues and, preferably about 7 amino acid residues. The nucleotide binding region 
binds preferentially to a target nucleotide of the formula CNN, where N is A, C, G or T. 
Preferably, the target nucleotide has the formula CAA, CAC, CAG, CAT, CCA, CCC, CCG, 
CCT, CGA, CGC, CGG, CGT, CTA, CTC, CTG or CTT. 
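
The sixteen targets follow directly from the formula CNN. As a minimal sketch (the function name is illustrative and not part of the disclosure), the triplets can be enumerated as follows:

```python
from itertools import product


def cnn_triplets():
    """Return every 5'-CNN-3' triplet, where N is A, C, G or T."""
    return ["C" + first + second for first, second in product("ACGT", repeat=2)]


triplets = cnn_triplets()
print(len(triplets))       # number of distinct CNN triplets
print(" ".join(triplets))  # the triplets in alphabetical order
```

Every triplet produced this way begins with cytosine, which is the defining feature of the target family.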

A polypeptide of this invention is a non-naturally occurring variant. As used herein, the 
term "non-naturally occurring" means, for example, one or more of the following: (a) a 
peptide comprised of a non-naturally occurring amino acid sequence; (b) a peptide having a 
non-naturally occurring secondary structure not associated with the peptide as it occurs in 
nature; (c) a peptide which includes one or more amino acids not normally associated with the 
species of organism in which that peptide occurs in nature; (d) a peptide which includes a 
stereoisomer of one or more of the amino acids comprising the peptide, which stereoisomer is 
not associated with the peptide as it occurs in nature; (e) a peptide which includes one or 
more chemical moieties other than one of the natural amino acids; or (f) an isolated portion of 
a naturally occurring amino acid sequence (e.g., a truncated sequence). A polypeptide of this 
invention exists in an isolated form and is purified to be substantially free of contaminating 
substances. A polypeptide is synthetic in nature. That is, the polypeptide is isolated and 
purified from natural sources or made de novo using techniques well known in the art. 
A zinc finger-nucleotide binding polypeptide refers to a polypeptide that is, preferably, a 
mutagenized form of a zinc finger protein or one produced through recombination. A 
polypeptide may be a hybrid which contains zinc finger domain(s) from one protein linked to 
zinc finger domain(s) of a second protein, for example. The domains may be wild type or 
mutagenized. A polypeptide includes a truncated form of a wild type zinc finger protein. 
Examples of zinc finger proteins from which a polypeptide can be produced include TFIIIA 
and zif268. 

A zinc finger-nucleotide binding polypeptide of this invention comprises a unique 
heptamer (contiguous sequence of 7 amino acid residues) within the α-helical domain of the 
polypeptide, which heptameric sequence determines binding specificity to a target nucleotide. 
That heptameric sequence can be located anywhere within the α-helical domain but it is 
preferred that the heptamer extend from position -1 to position 6 as the residues are 
conventionally numbered in the art. A polypeptide of this invention can include any β-sheet 
and framework sequences known in the art to function as part of a zinc finger protein. A 
large number of zinc finger-nucleotide binding polypeptides were made and tested for binding 
specificity against target nucleotides containing a CNN triplet. 

The zinc finger-nucleotide binding polypeptide derivative can be derived or produced 
from a wild type zinc finger protein by truncation or expansion, or as a variant of the wild 
type-derived polypeptide by a process of site-directed mutagenesis, or by a combination of the 
procedures. The term "truncated" refers to a zinc finger-nucleotide binding polypeptide that 
contains fewer than the full number of zinc fingers found in the native zinc finger binding 
protein or from which non-desired sequences have been deleted. For example, truncation of the 
zinc finger-nucleotide binding protein TFIIIA, which naturally contains nine zinc fingers, 
might be a polypeptide with only zinc fingers one through three. Expansion refers to a zinc 
finger polypeptide to which additional zinc finger modules have been added. For example, 
TFIIIA may be extended to 12 fingers by adding 3 zinc finger domains. In addition, a 
truncated zinc finger-nucleotide binding polypeptide may include zinc finger modules from 
more than one wild type polypeptide, thus resulting in a "hybrid" zinc finger-nucleotide 
binding polypeptide. 

The term "mutagenized" refers to a zinc finger derived-nucleotide binding polypeptide 
that has been obtained by performing any of the known methods for accomplishing random or 
site-directed mutagenesis of the DNA encoding the protein. For instance, in TFIIIA, 
mutagenesis can be performed to replace nonconserved residues in one or more of the repeats 
of the consensus sequence. Truncated zinc finger-nucleotide binding proteins can also be 
mutagenized. Examples of known zinc finger-nucleotide binding polypeptides that can be 
truncated, expanded, and/or mutagenized according to the present invention in order to inhibit 
the function of a nucleotide sequence containing a zinc finger-nucleotide binding motif 
include TFIIIA and zif268. Those of skill in the art know other zinc finger-nucleotide 
binding proteins. 

In one embodiment, a polypeptide of the invention contains a binding region that has 
an amino acid residue sequence with the same nucleotide binding characteristics as any of 
SEQ ID NOs:1-25. A detailed description of how those binding characteristics were 
determined can be found hereinafter in the Examples. Such a polypeptide competes for 
binding to a nucleotide target with any of SEQ ID NOs:1-25. That is, a preferred polypeptide 
contains a binding region that will displace, in a competitive manner, the binding of any of 
SEQ ID NOs:1-25. Means for determining competitive binding are well known in the art. 
Preferably, the binding region has the amino acid residue sequence of any of SEQ ID 
NOs:1-25. 

A polypeptide of this invention can be made using a variety of standard techniques 
well known in the art. As disclosed in detail hereinafter in the Examples, phage display 
libraries of zinc finger proteins were created and selected under conditions that favored 
enrichment of sequence-specific proteins. Zinc finger domains recognizing a number of 
sequences required refinement by site-directed mutagenesis that was guided by both phage 
selection data and structural information. 

Previously we reported the characterization of 16 zinc finger domains specifically 
recognizing each of the 5'-GNN-3' type of DNA sequences, which were isolated by phage 
display selections based on C7, a variant of the mouse transcription factor Zif268, and refined 
by site-directed mutagenesis [Segal et al., (1999) Proc Natl Acad Sci USA 96(6), 2758-2763; 
Dreier et al., (2000) J. Mol. Biol. 303, 489-502; and United States Patent No. 6,140,081, the 
disclosure of which is incorporated herein by reference]. In general, the specific DNA 
recognition of zinc finger domains of the Cys2-His2 type is mediated by the amino acid 
residues -1, 3, and 6 of each α-helix, although not in every case are all three residues 
contacting a DNA base. One dominant cross-subsite interaction has been observed from 
position 2 of the recognition helix. Asp2 has been shown to stabilize the binding of zinc 
finger domains by directly contacting the complementary adenine or cytosine of the 5' 
thymine or guanine, respectively, of the following 3 bp subsite. These non-modular 
interactions have been described as target site overlap. In addition, other interactions of 
amino acids with nucleotides outside the 3 bp subsites creating extended binding sites have 
been reported [Pavletich et al., (1991) Science 252(5007), 809-817; Elrod-Erickson et al., 
(1996) Structure 4(10), 1171-1180; Isalan et al., (1997) Proc Natl Acad Sci USA 94(11), 
5617-5621]. 

Selection of the previously reported phage display library for zinc finger domains 
binding to 5' nucleotides other than guanine or thymine met with no success, due to the cross- 
subsite interaction from aspartate in position 2 of the finger-3 recognition helix RSD-E-LKR 
(SEQ ID NO:26) (Fig. 1). To extend the availability of zinc finger domains for the 
construction of artificial transcription factors, domains specifically recognizing the 5'-ANN-3' 
type of DNA sequences were selected (United States Patent Application Serial No. 
09/791,106, filed February 21, 2001, the disclosure of which is incorporated herein by 
reference). Other groups have described a sequential selection method which led to the 
characterization of domains recognizing four 5'-ANN-3' subsites, 5'-AAA-3', 5'-AAG-3', 
5'-ACA-3', and 5'-ATA-3' [Greisman et al., (1997) Science 275(5300), 657-661; Wolfe et al., 
(1999) J. Mol. Biol. 285(5), 1917-1934]. The present disclosure uses an approach to select zinc 
finger domains recognizing CNN sites by eliminating the target site overlap. First, finger 3 of 
C7 (RSD-E-KKR) (SEQ ID NO:27) binding to the subsite 5'-GCG-3' was exchanged with a 
domain which did not contain aspartate in position 2 (Fig. 1). The helix TSG-N-LVR (SEQ 
ID NO:28), previously characterized in finger 2 position to bind with high specificity to the 
triplet 5'-GAT-3', seemed a good candidate. This 3-finger protein (C7.GAT; Fig. 1A, lower 
panel), containing finger 1 and 2 of C7 and the 5'-GAT-3'-recognition helix in finger-3 
position, was analyzed for DNA-binding specificity on targets with different finger-2 subsites 
by multi-target ELISA in comparison with the original C7 protein (C7.GCG; Fig. 1B). Both 
proteins bound to the 5'-TGG-3' subsite (note that C7.GCG binds also to 5'-GGG-3' due to 
the 5' specification of thymine or guanine by Asp2 of finger 3, which has been reported earlier). 
The recognition of the 5' nucleotide of the finger-2 subsite was evaluated using a mixture of 
all 16 5'-XNN-3' target sites (X = adenine, guanine, cytosine or thymine). Indeed, while the 
original C7.GCG protein specified a guanine or thymine in the 5' position of finger 2, 
C7.GAT did not specify a base, indicating that the cross-subsite interaction to the adenine 
complementary to the 5' thymine was abolished. A similar effect has previously been 
reported for variants of Zif268 where Asp2 was replaced by Ala2 by site-directed mutagenesis 
[Isalan et al., (1997) Proc Natl Acad Sci USA 94(11), 5617-5621; Dreier et al., (2000) 
J. Mol. Biol. 303, 489-502]. The affinity of C7.GAT, measured by gel mobility shift analysis, 
was found to be relatively low, about 400 nM compared to 0.5 nM for C7.GCG [Segal et al., 
(1999) Proc Natl Acad Sci USA 96(6), 2758-2763], which may in part be due to the lack of 
the Asp2 in finger 3. 

Based on the 3-finger protein C7.GAT, a library was constructed in the phage display 
vector pComb3H [Barbas et al., (1991) Proc. Natl. Acad. Sci. USA 88, 7978-7982; Rader et 
al., (1997) Curr. Opin. Biotechnol. 8(4), 503-508]. Randomization involved positions -1, 1, 
2, 3, 5, and 6 of the α-helix of finger 2 using a VNS codon doping strategy (V = adenine, 
cytosine or guanine, N = adenine, cytosine, guanine or thymine, S = cytosine or guanine). 
This allowed 24 possibilities for each randomized amino acid position, whereas the aromatic 
amino acids Trp, Phe, and Tyr, as well as stop codons, were excluded in this strategy. 
Because Leu is predominantly found in position 4 of the recognition helices of zinc finger 
domains of the type Cys2-His2, this position was not randomized. After transformation of the 
library into ER2537 cells (New England Biolabs) the library contained 1.5 x 10^ members. 
This exceeded the necessary library size by 60-fold and was sufficient to contain all amino 
acid combinations. 
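
The VNS doping arithmetic can be checked with a short sketch. It assumes the standard genetic code (written here in the conventional T, C, A, G ordering) and uses illustrative variable names that do not come from the disclosure. It enumerates the 24 VNS codons, lists the residues they encode, and counts the combinations over the six randomized positions. Under the standard code, the enumeration also shows that Cys is absent from the VNS set, in addition to Trp, Phe, Tyr and the stop codons.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# VNS doping: V = A, C or G; N = any base; S = C or G.
V, N, S = "ACG", "ACGT", "CG"
vns_codons = ["".join(codon) for codon in product(V, N, S)]

# Residues reachable with VNS codons, and those that are not ("*" = stop).
encoded = sorted({CODON_TABLE[codon] for codon in vns_codons})
excluded = sorted(set(AMINO_ACIDS) - set(encoded))

# Helix positions randomized in finger 2 (position 4 is kept as Leu).
RANDOMIZED_POSITIONS = (-1, 1, 2, 3, 5, 6)
codon_combinations = len(vns_codons) ** len(RANDOMIZED_POSITIONS)
residue_combinations = len(encoded) ** len(RANDOMIZED_POSITIONS)

print(len(vns_codons), "codons per position")
print("encoded residues:", "".join(encoded))
print("excluded:", "".join(excluded))
print(codon_combinations, "codon combinations")
print(residue_combinations, "amino acid combinations")
```

The two totals differ because several residues are encoded by more than one VNS codon; which of the two is the relevant measure of "necessary library size" depends on whether diversity is counted at the DNA or the protein level.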

Six rounds of selection were performed with zinc finger-displaying phage binding to 
each of the sixteen 5'-GAT-CNN-GCG-3' (SEQ ID NO:29) biotinylated hairpin target 
oligonucleotides, respectively, in the presence of non-biotinylated competitor DNA. 
Stringency of the selection was increased in each round by decreasing the amount of 
biotinylated target oligonucleotide and increasing the amounts of the competitor oligonucleotide 
mixtures. In the sixth round the target concentration was usually 18 nM; the 5'-ANN-3', 
5'-GNN-3', and 5'-TNN-3' competitor mixtures were in 5-fold excess for each oligonucleotide 
pool, respectively, and the specific 5'-CNN-3' mixture (excluding the target sequence) was in 
10-fold excess. Phage binding to the biotinylated target oligonucleotide was recovered by 
capture to streptavidin-coated magnetic beads. Clones were usually analyzed after the sixth 
round of selection. 
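
The composition of a sixth-round selection mixture can be laid out as a small sketch. The function and key names are hypothetical, only the 9-bp core of each hairpin target is modeled (the full hairpin sequence is not given here), and the fold excess is assumed to apply to each competitor pool as a whole rather than to each individual oligonucleotide.

```python
from itertools import product

BASES = "ACGT"


def nn_pool(five_prime_base):
    """All 16 triplets that start with the given 5' base."""
    return [five_prime_base + a + b for a, b in product(BASES, repeat=2)]


def target_site(finger2_subsite):
    """Place a finger-2 subsite in the 5'-GAT-CNN-GCG-3' context."""
    return "GAT" + finger2_subsite + "GCG"


def round_six_mix(target_triplet, target_nM=18.0):
    """Describe target and competitor pools for a sixth-round selection.

    Assumption: the stated fold excess is taken per pool, relative to
    the target concentration.
    """
    competitors = {}
    for base in "AGT":
        competitors[base + "NN"] = {
            "members": nn_pool(base),
            "total_nM": 5 * target_nM,
        }
    # The specific CNN competitor pool leaves out the target itself.
    competitors["CNN"] = {
        "members": [t for t in nn_pool("C") if t != target_triplet],
        "total_nM": 10 * target_nM,
    }
    return {
        "target": target_site(target_triplet),
        "target_nM": target_nM,
        "competitors": competitors,
    }


mix = round_six_mix("CAG")
print(mix["target"], mix["target_nM"], "nM")
for name, pool in mix["competitors"].items():
    print(name, len(pool["members"]), "members,", pool["total_nM"], "nM")
```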

III. Compositions 

In another aspect, the present invention provides a plurality of zinc finger-nucleotide 
binding polypeptides operatively linked in such a manner to specifically bind a nucleotide 
target motif defined as 5'-(CNN)n-3', where n is an integer greater than 1. The target motif 
can be located within any longer nucleotide sequence (e.g., from 3 to 13 or more TNN, GNN, 
ANN or NNN sequences). Preferably, n is an integer from 2 to about 12, and more preferably 
from 2 to 6. The individual polypeptides are preferably linked with oligopeptide linkers. 
Such linkers preferably resemble a linker found in naturally occurring zinc finger proteins. A 
preferred linker for use in the present invention is the amino acid residue sequence TGEKP 
(SEQ ID NO:30). Other linkers such as glycine or serine repeats are well known in the art to 
link peptides (e.g., single chain antibody domains) and can be used in a composition of this 
invention. 
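
As an illustration of the 5'-(CNN)n-3' motif and the linker arrangement, the sketch below checks that a site consists of two or more CNN triplets, splits it into subsites, and joins domain sequences with the TGEKP linker. The function names and the placeholder domain labels are hypothetical; real finger sequences would be substituted for the placeholders.

```python
import re

# Two or more consecutive CNN triplets, i.e. 5'-(CNN)n-3' with n > 1.
CNN_MOTIF = re.compile(r"(?:C[ACGT]{2}){2,}")

LINKER = "TGEKP"  # preferred linker (SEQ ID NO:30)


def split_cnn_motif(site):
    """Split a 5'-(CNN)n-3' site into its 3-bp subsites, or raise ValueError."""
    site = site.upper()
    if not CNN_MOTIF.fullmatch(site):
        raise ValueError(f"{site!r} is not of the form 5'-(CNN)n-3' with n > 1")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def link_domains(domains, linker=LINKER):
    """Join finger domain sequences with the oligopeptide linker."""
    return linker.join(domains)


print(split_cnn_motif("CAGCTTCCA"))
# Placeholder labels stand in for actual finger domain sequences.
print(link_domains(["F1", "F2", "F3"]))
```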

A polypeptide or composition of this invention can be operatively linked to one or 
more functional peptides. Such functional peptides are well known in the art and can be a 
transcription regulating factor such as a repressor or activation domain or a peptide having 
other functions. Exemplary and preferred such functional peptides are nucleases, methylases, 
nuclear localization domains, and restriction enzymes such as endo- or ectonucleases (See, 
e.g., Chandrasegaran and Smith, Biol. Chem., 380:841-848, 1999). 

An exemplary repression domain peptide is the ERF repressor domain (ERD) 
(Sgouras, D. N., Athanasiou, M. A., Beal, G. J., Jr., Fisher, R. J., Blair, D. G. & 
Mavrothalassitis, G. J. (1995) EMBO J. 14, 4781-4793), defined by amino acids 473 to 530 
of the ets2 repressor factor (ERF). This domain mediates the antagonistic effect of ERF on 
the activity of transcription factors of the ets family. A synthetic repressor is constructed by 
fusion of this domain to the N- or C-terminus of the zinc finger protein. A second repressor 
protein is prepared using the Krüppel-associated box (KRAB) domain (Margolin, J. F., 
Friedman, J. R., Meyer, W. K.-H., Vissing, H., Thiesen, H.-J. & Rauscher III, F. J. (1994) 
Proc. Natl. Acad. Sci. USA 91, 4509-4513). This repressor domain is commonly found at the 
N-terminus of zinc finger proteins and presumably exerts its repressive activity on TATA- 
dependent transcription in a distance- and orientation-independent manner (Pengue, G. & 
Lania, L. (1996) Proc. Natl. Acad. Sci. USA 93, 1015-1020), by interacting with the RING 
finger protein KAP-1 (Friedman, J. R., Fredericks, W. J., Jensen, D. E., Speicher, D. W., 
Huang, X.-P., Neilson, E. G. & Rauscher III, F. J. (1996) Genes & Dev. 10, 2067-2078). We 
utilized the KRAB domain found between amino acids 1 and 97 of the zinc finger protein 
KOX1 (Margolin, J. F., Friedman, J. R., Meyer, W. K.-H., Vissing, H., Thiesen, H.-J. & 
Rauscher III, F. J. (1994) Proc. Natl. Acad. Sci. USA 91, 4509-4513). In this case an N- 
terminal fusion with a zinc-finger polypeptide is constructed. Finally, to explore the utility of 
histone deacetylation for repression, amino acids 1 to 36 of the Mad mSIN3 interaction 
domain (SID) are fused to the N-terminus of the zinc finger protein (Ayer, D. E., Laherty, C. 
D., Lawrence, Q. A., Armstrong, A. P. & Eisenman, R. N. (1996) Mol. Cell. Biol. 16, 5772- 
5781). This small domain is found at the N-terminus of the transcription factor Mad and is 
responsible for mediating its transcriptional repression by interacting with mSIN3, which in 
turn interacts with the co-repressor N-CoR and with the histone deacetylase mRPD1 (Heinzel, T., 
Lavinsky, R. M., Mullen, T.-M., Söderström, M., Laherty, C. D., Torchia, J., Yang, W.-M., 
Brard, G., Ngo, S. D. et al. (1997) Nature 387, 43-46). To examine gene-specific 
activation, transcriptional activators are generated by fusing the zinc finger polypeptide to 
amino acids 413 to 489 of the herpes simplex virus VP16 protein (Sadowski, I., Ma, J., 
Triezenberg, S. & Ptashne, M. (1988) Nature 335, 563-564), or to an artificial tetrameric 
repeat of VP16's minimal activation domain (Seipel, K., Georgiev, O. & Schaffner, W. 
(1992) EMBO J. 11, 4961-4968), termed VP64. 

A polynucleotide of this invention, as set forth above, can be operatively linked to one 
or more transcription modulating or regulating factors. Modulating factors such as 
transcription activators or transcription suppressors or repressors are well known in the art. 
Means for operatively linking polypeptides to such factors are also well known in the art. 
Exemplary and preferred such factors and their use to modulate gene expression are discussed 
in detail hereinafter. 

In order to test the concept of using zinc finger proteins as gene-specific 
transcriptional regulators, six-finger proteins are fused to a number of effector domains. 
Transcriptional repressors are generated by attaching any of three human-derived repressor 
domains to the zinc finger protein. The first repressor protein is prepared using the ERF 
repressor domain (ERD) (Sgouras, D. N., Athanasiou, M. A., Beal, G. J., Jr., Fisher, R. J., 
Blair, D. G. & Mavrothalassitis, G. J. (1995) EMBO J. 14, 4781-4793), defined by amino 
acids 473 to 530 of the ets2 repressor factor (ERF). This domain mediates the antagonistic 
effect of ERF on the activity of transcription factors of the ets family. A synthetic repressor is 
constructed by fusion of this domain to the C-terminus of the zinc finger protein. The second 
repressor protein is prepared using the Krüppel-associated box (KRAB) domain (Margolin, J. 
F., Friedman, J. R., Meyer, W. K.-H., Vissing, H., Thiesen, H.-J. & Rauscher III, F. J. (1994) 
Proc. Natl. Acad. Sci. USA 91, 4509-4513). This repressor domain is commonly found at the 
N-terminus of zinc finger proteins and presumably exerts its repressive activity on TATA- 
dependent transcription in a distance- and orientation-independent manner (Pengue, G. & 
Lania, L. (1996) Proc. Natl. Acad. Sci. USA 93, 1015-1020), by interacting with the RING 



WO 03/016496 PCT/US02/26388 

finger protein KAP-1 (Friedman, J. R., Fredericks, W. J., Jensen, D. E., Speicher, D. W., 
Huang, X.-P., Neilson, E. G. & Rauscher III, F. J. (1996) Genes & Dev. 10, 2067-2078). We 
utilize the KRAB domain found between amino acids 1 and 97 of the zinc finger protein 
KOX1 (Margolin, J. F., Friedman, J. R., Meyer, W. K.-H., Vissing, H., Thiesen, H.-J. & 
Rauscher III, F. J. (1994) Proc. Natl. Acad. Sci. USA 91, 4509-4513). In this case an N- 
terminal fusion with the six-finger protein is constructed. Finally, to explore the utility of 
histone deacetylation for repression, amino acids 1 to 36 of the Mad mSIN3 interaction 
domain (SID) are fused to the N-terminus of a zinc finger protein (Ayer, D. E., Laherty, C. 
D., Lawrence, Q. A., Armstrong, A. P. & Eisenman, R. N. (1996) Mol. Cell. Biol. 16, 5772- 
5781). This small domain is found at the N-terminus of the transcription factor Mad and is 
responsible for mediating its transcriptional repression by interacting with mSIN3, which in 
turn interacts with the co-repressor N-CoR and with the histone deacetylase mRPD1 (Heinzel, T., 
Lavinsky, R. M., Mullen, T.-M., Söderström, M., Laherty, C. D., Torchia, J., Yang, W.-M., 
Brard, G., Ngo, S. D. et al. (1997) Nature 387, 43-46). 

To examine gene-specific activation, transcriptional activators are generated by fusing 
the zinc finger protein to amino acids 413 to 489 of the herpes simplex virus VP16 protein 
(Sadowski, I., Ma, J., Triezenberg, S. & Ptashne, M. (1988) Nature 335, 563-564), or to an 
artificial tetrameric repeat of VP16's minimal activation domain, DALDDFDLDML (SEQ ID 
NO:36) (Seipel, K., Georgiev, O. & Schaffner, W. (1992) EMBO J. 11, 4961-4968), termed 
VP64. 

Reporter constructs containing fragments of the erbB-2 promoter coupled to a 
luciferase reporter gene are generated to test the specific activities of our designed 
transcriptional regulators. The target reporter plasmid contains nucleotides -758 to -1 with 
respect to the ATG initiation codon. Promoter fragments display similar activities when 
transfected transiently into HeLa cells, in agreement with previous observations (Hudson, L. 
G., Ertl, A. P. & Gill, G. N. (1990) J. Biol. Chem. 265, 4389-4393). To test the effect of zinc 
finger-repressor domain fusion constructs on erbB-2 promoter activity, HeLa cells are 
transiently co-transfected with zinc finger expression vectors and the luciferase reporter 
constructs. Significant repression is observed with each construct. The utility of gene- 
specific polydactyl proteins to mediate activation of transcription is investigated using the 
same two reporter constructs. 

The data herein show that zinc finger proteins capable of binding novel 9- and 18-bp 
DNA target sites can be rapidly prepared using pre-defined domains recognizing 5'-CNN-3' 
sites. This information is sufficient for the preparation of 16^6, or about 17 million, novel six-finger 
proteins each capable of binding 18 bp of DNA sequence. This rapid methodology for the 
construction of novel zinc finger proteins has advantages over the sequential generation and 
selection of zinc finger domains proposed by others (Greisman, H. A. & Pabo, C. O. (1997) 
Science 275, 657-661) and takes advantage of structural information that suggests that the 
potential for the target overlap problem as defined above might be avoided in proteins 
targeting 5'-CNN-3' sites. Using the complex and well studied erbB-2 promoter and live 
human cells, the data demonstrate that these proteins, when provided with the appropriate 
effector domain, can be used to provoke or activate expression and to produce graded levels 
of repression down to the level of the background in these experiments. 
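
The count of possible proteins follows from choosing one of the 16 predefined 5'-CNN-3' domains independently at each finger position. A minimal sketch of that arithmetic, with illustrative function names:

```python
DOMAINS_PER_POSITION = 16   # one predefined domain per 5'-CNN-3' triplet
BP_PER_FINGER = 3           # each finger recognizes a 3-bp subsite


def protein_count(n_fingers):
    """Number of distinct n-finger proteins built from predefined domains."""
    return DOMAINS_PER_POSITION ** n_fingers


def site_length_bp(n_fingers):
    """Length of the DNA site recognized by an n-finger protein."""
    return BP_PER_FINGER * n_fingers


for n_fingers in (3, 6):
    print(n_fingers, "fingers:", protein_count(n_fingers),
          "proteins,", site_length_bp(n_fingers), "bp site")
```

For six fingers this gives 16,777,216 combinations, which is the "17 million" figure quoted in the text.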

IV. Polynucleotides, Expression Vectors and Transformed Cells 

The invention includes a nucleotide sequence encoding a zinc finger-nucleotide 
binding polypeptide. DNA sequences encoding the zinc finger-nucleotide binding 
polypeptides of the invention, including native, truncated, and expanded polypeptides, can be 
obtained by several methods. For example, the DNA can be isolated using hybridization 
procedures that are well known in the art. These include, but are not limited to: (1) 
hybridization of probes to genomic or cDNA libraries to detect shared nucleotide sequences; 
(2) antibody screening of expression libraries to detect shared structural features; and (3) 
synthesis by the polymerase chain reaction (PCR). RNA sequences of the invention can be 
obtained by methods known in the art (See, for example, Current Protocols in Molecular 
Biology, Ausubel, et al., Eds., 1989). 

Specific DNA sequences encoding zinc finger-nucleotide binding 
polypeptides of the invention can be obtained by: (1) isolation of a double-stranded DNA 
sequence from the genomic DNA; (2) chemical manufacture of a DNA sequence to provide 
the necessary codons for the polypeptide of interest; and (3) in vitro synthesis of a double- 
stranded DNA sequence by reverse transcription of mRNA isolated from a eukaryotic donor 
cell. In the latter case, a double-stranded DNA complement of mRNA is eventually formed 
which is generally referred to as cDNA. Of these three methods for developing specific DNA 
sequences for use in recombinant procedures, the isolation of genomic DNA is the least 
common. This is especially true when it is desirable to obtain the microbial expression of 
mammalian polypeptides due to the presence of introns. For obtaining zinc finger derived- 
DNA binding polypeptides, the synthesis of DNA sequences is frequently the method of 
choice when the entire sequence of amino acid residues of the desired polypeptide product is 
known. When the entire sequence of amino acid residues of the desired polypeptide is not 
known, the direct synthesis of DNA sequences is not possible and the method of choice is the 
formation of cDNA sequences. Among the standard procedures for isolating cDNA 
sequences of interest is the formation of plasmid-carrying cDNA libraries which are derived 
from reverse transcription of mRNA which is abundant in donor cells that have a high level 
of genetic expression. When used in combination with polymerase chain reaction technology, 
even rare expression products can be cloned. In those cases where significant portions of the 
amino acid sequence of the polypeptide are known, the production of labeled single or 
double-stranded DNA or RNA probe sequences duplicating a sequence putatively present in 
the target cDNA may be employed in DNA/DNA hybridization procedures which are carried 
out on cloned copies of the cDNA which have been denatured into a single-stranded form 
(Jay, et al., Nucleic Acid Research 11:2325, 1983). 
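
When the full amino acid sequence is known, "providing the necessary codons" amounts to back-translation. The sketch below is illustrative only: it assigns one codon per residue from the standard genetic code, and the particular codon chosen for each residue is an arbitrary assumption, not a codon usage taught by this disclosure. It is applied to the TGEKP linker as a short example.

```python
# One codon per residue (standard genetic code). The specific choice among
# synonymous codons is an arbitrary assumption made for illustration.
ONE_CODON_PER_RESIDUE = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(peptide):
    """Return one DNA coding sequence for a peptide in one-letter code."""
    try:
        return "".join(ONE_CODON_PER_RESIDUE[residue] for residue in peptide.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]!r}") from None


linker_dna = back_translate("TGEKP")
print(linker_dna)
```

Because the genetic code is degenerate, many other DNA sequences encode the same peptide; in practice the codons would be chosen to suit the expression host.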

V. Pharmaceutical Compositions 

In another aspect, the present invention provides a pharmaceutical composition 
comprising a therapeutically effective amount of a zinc finger-nucleotide binding polypeptide 
or composition or a therapeutically effective amount of a nucleotide sequence that encodes a 
zinc finger-nucleotide binding polypeptide in combination with a pharmaceutically acceptable 
carrier. 

As used herein, the terms "pharmaceutically acceptable", "physiologically tolerable" 
and grammatical variations thereof, as they refer to compositions, carriers, diluents and 
reagents, are used interchangeably and represent that the materials are capable of 
administration to or upon a human without the production of undesirable physiological effects 
such as nausea, dizziness, gastric upset and the like which would be to a degree that would 
prohibit administration of the composition. 

The preparation of a pharmacological composition that contains active ingredients 
dissolved or dispersed therein is well understood in the art. Typically such compositions are 
prepared as sterile injectables either as liquid solutions or suspensions, aqueous or non- 
aqueous; however, solid forms suitable for solution, or suspension, in liquid prior to use can 
also be prepared. The preparation can also be emulsified. The active ingredient can be mixed 
with excipients that are pharmaceutically acceptable and compatible with the active 
ingredient and in amounts suitable for use in the therapeutic methods described herein. 
Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol or the like and 
combinations thereof. In addition, if desired, the composition can contain minor amounts of 
auxiliary substances such as wetting or emulsifying agents, as well as pH buffering agents and 
the like which enhance the effectiveness of the active ingredient. 

The therapeutic pharmaceutical composition of the present invention can include 
pharmaceutically acceptable salts of the components therein. Pharmaceutically acceptable 
salts include the acid addition salts (formed with the free amino groups of the polypeptide) 
that are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, 
or such organic acids as acetic, tartaric, mandelic and the like. Salts formed with the free 
carboxyl groups can also be derived from inorganic bases such as, for example, sodium, 
potassium, ammonium, calcium or ferric hydroxides, and such organic bases as 
isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine and the like. 
Physiologically tolerable carriers are well known in the art. Exemplary of liquid carriers are 
sterile aqueous solutions that contain no materials in addition to the active ingredients and 
water, or contain a buffer such as sodium phosphate at physiological pH value, physiological 
saline or both, such as phosphate-buffered saline. Still further, aqueous carriers can contain 
more than one buffer salt, as well as salts such as sodium and potassium chlorides, dextrose, 
propylene glycol, polyethylene glycol and other solutes. Liquid compositions can also 
contain liquid phases in addition to and to the exclusion of water. Exemplary of such 
additional liquid phases are glycerin, vegetable oils such as cottonseed oil, organic esters such 
as ethyl oleate, and water-oil emulsions. 

VI. Uses 

In one embodiment, a method of the invention includes a process for modulating 
(inhibiting or suppressing) expression of a nucleotide sequence that contains a CNN target 
sequence. The method includes the step of contacting the nucleotide with an effective 
amount of a zinc finger-nucleotide binding polypeptide of this invention that binds to the 
motif. In the case where the nucleotide sequence is a promoter, the method includes 
inhibiting the transcriptional transactivation of a promoter containing a zinc finger-DNA 
binding motif. The term "inhibiting" refers to the suppression of the level of activation of 
transcription of a structural gene operably linked to a promoter, containing a zinc finger- 
nucleotide binding motif, for example. In addition, the zinc finger-nucleotide binding 
polypeptide can bind a target within a structural gene or within an RNA sequence. 

The term "effective amount" includes that amount which results in the deactivation of 
a previously activated promoter or that amount which results in the inactivation of a promoter 
containing a target nucleotide, or that amount which blocks transcription of a structural gene 
or translation of RNA. The amount of zinc finger derived-nucleotide binding polypeptide 
required is that amount necessary to either displace a native zinc finger-nucleotide binding 
protein in an existing protein/promoter complex, or that amount necessary to compete with 
the native zinc finger-nucleotide binding protein to form a complex with the promoter itself. 
Similarly, the amount required to block a structural gene or RNA is that amount which binds 
to and blocks RNA polymerase from reading through on the gene or that amount which 
inhibits translation, respectively. Preferably, the method is performed intracellularly. By 
functionally inactivating a promoter or structural gene, transcription or translation is 
suppressed. Delivery of an effective amount of the inhibitory protein for binding to or 
"contacting" the cellular nucleotide sequence containing the target sequence can be 
accomplished by one of the mechanisms described herein, such as by retroviral vectors or 
liposomes, or other methods well known in the art. The term "modulating" refers to the 
suppression, enhancement or induction of a function. For example, the zinc finger-nucleotide 
binding polypeptide of the invention can modulate a promoter sequence by binding to a target 
sequence within the promoter, thereby enhancing or suppressing transcription of a gene 
operatively linked to the promoter nucleotide sequence. Alternatively, modulation may 
include inhibition of transcription of a gene where the zinc finger-nucleotide binding 
polypeptide binds to the structural gene and blocks DNA-dependent RNA polymerase from 
reading through the gene, thus inhibiting transcription of the gene. The structural gene may 
be a normal cellular gene or an oncogene, for example. Alternatively, modulation may 
include inhibition of translation of a transcript. 

The promoter region of a gene includes the regulatory elements that typically lie 5' to 
a structural gene. If a gene is to be activated, proteins known as transcription factors attach to 
the promoter region of the gene. This assembly resembles an "on switch" by enabling an 
enzyme to transcribe a second genetic segment from DNA to RNA. In most cases the 
resulting RNA molecule serves as a template for synthesis of a specific protein; sometimes 
RNA itself is the final product. 

The promoter region may be a normal cellular promoter or, for example, an 
onco-promoter. An onco-promoter is generally a virus-derived promoter. For example, the long 
terminal repeat (LTR) of retroviruses is a promoter region that may be a target for a zinc 
finger binding polypeptide variant of the invention. Promoters from members of the 
Lentivirus group, which include such pathogens as human T-cell lymphotrophic virus 
(HTLV) 1 and 2, or human immunodeficiency virus (HIV) 1 or 2, are examples of viral 
promoter regions which may be targeted for transcriptional modulation by a zinc finger 
binding polypeptide of the invention. 

A target CNN nucleotide sequence can be located in a transcribed region of a gene or 
in an expressed sequence tag. A gene containing a target sequence can be a plant gene, an 
animal gene or a viral gene. The gene can be a eukaryotic or prokaryotic gene such as a 
bacterial gene. The animal gene can be a mammalian gene including a human gene. 

In a preferred embodiment, a method of modulating nucleotide expression is accomplished by 
transforming a cell that contains a target nucleotide sequence with a polynucleotide that 
encodes a polypeptide or composition of this invention. Preferably, the encoding 
polynucleotide is contained in an expression vector suitable for use in a target cell. Suitable 
expression vectors are well known in the art. 

The CNN target can exist in any combination with other target triplet sequences. That is, 
a particular CNN target can exist as part of an extended CNN sequence (e.g., (CNN)2-12) or as 
part of any other extended sequence such as (GNN)1-12, (ANN)1-12, (TNN)1-12 or (NNN)1-12. 

The Examples that follow illustrate preferred embodiments of the present invention and are 
not limiting of the specification and claims in any way. 
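
The way a CNN subsite sits among other triplet families in an extended target can be sketched in a few lines of Python. This is only an illustration: the function names and the example site are hypothetical and are not taken from the specification.

```python
from itertools import product

BASES = "ACGT"

# The 16 triplets of the form 5'-CNN-3', where N is A, C, G or T.
CNN_TRIPLETS = ["C" + a + b for a, b in product(BASES, repeat=2)]


def split_triplets(site):
    """Split a 5'->3' target site into consecutive 3-bp subsites."""
    site = site.upper()
    if len(site) % 3 != 0 or any(base not in BASES for base in site):
        raise ValueError("site must be A/C/G/T with a length divisible by 3")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def classify_site(site):
    """Pair each subsite with its family label (ANN, CNN, GNN or TNN)."""
    return [(triplet, triplet[0] + "NN") for triplet in split_triplets(site)]


if __name__ == "__main__":
    print(len(CNN_TRIPLETS))
    # Hypothetical 12-bp site mixing GNN and CNN subsites.
    for triplet, family in classify_site("GATCAGGCGCTT"):
        print(triplet, family)
```

The family label is simply the 5' nucleotide of each triplet followed by NN, which mirrors the (CNN), (GNN), (ANN) and (TNN) notation used above.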


Example 1: Construction of zinc finger library and selection via phage display. 

Construction of the zinc finger library was based on the earlier described C7 protein 
([Wu et al., (1995) PNAS 92, 344-348]; Fig 1A, upper panel). Finger 3 recognizing the 
5'-GCG-3' subsite was replaced by a domain binding to a 5'-GAT-3' subsite [Segal et al., 
(1999) Proc Natl Acad Sci USA 96(6), 2758-2763] via an overlap PCR strategy using a 
primer coding for finger 3 
(5'-GAGGAAGTTTGCCACCAGTGGCAACCTGGTGAGGCATACCAAAATC-3') (SEQ ID NO:31) 
and a pMal-specific primer (5'-GTAAAACGACGGCCAGTGCCAAGC-3') (SEQ ID NO:32). 
Randomization of the zinc finger library by PCR overlap extension was essentially as 
described [Wu et al., (1995) PNAS 92, 344-348; Segal et al., (1999) Proc Natl Acad Sci 
USA 96(6), 2758-2763]. The library was ligated into the phagemid vector pComb3H 
[Rader et al., (1997) Curr. Opin. Biotechnol. 8(4), 503-508]. Growth and precipitation of 
phage were performed as previously described 
[Barbas et al., (1991) Methods: Companion Methods Enzymol. 2(2), 119-124; Barbas et al., 
(1991) Proc. Natl. Acad. Sci. USA 88, 7978-7982; Segal et al., (1999) Proc Natl Acad Sci 
USA 96(6), 2758-2763]. Binding reactions were performed in a volume of 500 µl zinc buffer A 
(ZBA: 10 mM Tris, pH 7.5/90 mM KCl/1 mM MgCl2/90 µM ZnCl2)/0.2% BSA/5 mM 
DTT/1% Blotto (Bio-Rad)/20 µg double-stranded, sheared herring sperm DNA containing 100 
µl precipitated phage (10^13 colony-forming units). Phage were allowed to bind to 
non-biotinylated competitor oligonucleotides for 1 hr at 4°C before the biotinylated target 
oligonucleotide was added. Binding continued overnight at 4°C. After incubation with 50 µl 
streptavidin-coated magnetic beads (Dynal; blocked with 5% Blotto in ZBA) for 1 hr, beads 
were washed ten times with 500 µl ZBA/2% Tween 20/5 mM DTT, and once with buffer 
containing no Tween. Elution of bound phage was performed by incubation in 25 µl trypsin 
(10 µg/ml) in TBS (Tris-buffered saline) for 30 min at room temperature. Hairpin 
competitor oligonucleotides had the sequence 
5'-GGCCGCN'N'N'ATCGAGTTTTCTCGATNNNGCGGCC-3' (SEQ ID NO:33) (target 
oligonucleotides were biotinylated), where NNN represents the finger-2 subsite 
oligonucleotides and N'N'N' its complementary bases. Target oligonucleotides were usually 
added at 72 nM in the first three rounds of selection, then decreased to 36 nM and 18 nM in 
the sixth and last round. As competitor, a 5'-TGG-3' finger-2 subsite oligonucleotide was 
used to compete with the parental clone. An equimolar mixture of 15 finger-2 5'-CNN-3' 
subsites, except for the target site, respectively, and competitor mixtures of each finger-2 
subsites of the type 5'-ANN-3', 5'-GNN-3', and 5'-TNN-3' were added in increasing 
amounts with each successive round of selection. Usually no specific 5'-CNN-3' competitor 
mix was added in the first round. 
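
The hairpin design of SEQ ID NO:33 and the 15-member CNN competitor mixture can be illustrated with a short Python sketch. It assumes that N'N'N' is written 5' to 3' as the reverse complement of NNN, so that the two arms of the hairpin pair with each other. The function names are hypothetical and are not part of the described protocol.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hairpin_oligo(subsite):
    """Fill the template 5'-GGCCGC N'N'N' ATCGAG TTTT CTCGAT NNN GCGGCC-3'.

    Assumption: N'N'N' is the reverse complement of NNN, which lets the
    5' arm base-pair with the 3' arm around the TTTT loop.
    """
    subsite = subsite.upper()
    if len(subsite) != 3 or any(base not in COMPLEMENT for base in subsite):
        raise ValueError("subsite must be three bases drawn from A, C, G, T")
    return ("GGCCGC" + reverse_complement(subsite) + "ATCGAG"
            + "TTTT" + "CTCGAT" + subsite + "GCGGCC")


def cnn_competitors(target):
    """List the 15 5'-CNN-3' subsites other than the selection target."""
    all_cnn = ["C" + a + b for a, b in product("ACGT", repeat=2)]
    return [s for s in all_cnn if s != target.upper()]


if __name__ == "__main__":
    print(hairpin_oligo("CAG"))
    print(cnn_competitors("CAG"))
```

With the finger-2 subsite CAG, the 3' arm of the resulting oligonucleotide contains 5'-GAT CAG GCG-3', matching the three-finger target layout used in the selections.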

Example 2: Multitarget Specificity Assay and Gel mobility shift analysis. 

The zinc finger-coding sequence was subcloned from pComb3H into a modified bacterial 
expression vector pMal-c2 (New England Biolabs). After transformation into XL1-Blue 
(Stratagene) the zinc finger-maltose-binding protein (MBP) fusions were expressed after 
addition of 1 mM isopropyl β-D-thiogalactoside (IPTG). Freeze/thaw extracts of these 
bacterial cultures were applied in 1:2 dilutions to 96-well plates coated with streptavidin 
(Pierce), and were tested for DNA-binding specificity against each of the sixteen 
5'-GAT CNN GCG-3' (SEQ ID NO:34) target sites, respectively. ELISA (enzyme-linked 
immunosorbent assay) was performed essentially as described [Segal et al., (1999) Proc 
Natl Acad Sci USA 96(6), 2758-2763; Dreier et al., (2000) J. Mol. Biol. 303, 489-502]. 
After incubation with a mouse anti-MBP (maltose-binding protein) antibody (Sigma, 1:1000), 
a goat anti-mouse antibody coupled with alkaline phosphatase (Sigma, 1:1000) was applied. 
Detection followed by addition of alkaline phosphatase substrate (Sigma), and the OD405 
was determined with SOFTMAX 2.35 (Molecular Devices). 

Gel shift analysis was performed with purified protein (Protein Fusion and Purification 
System, New England Biolabs) essentially as described. 

Example 3: Site-directed mutagenesis of finger 2. 

Finger-2 mutants were constructed by PCR as described [Segal et al., (1999) Proc 
Natl Acad Sci USA 96(6), 2758-2763; Dreier et al., (2000) J. Mol. Biol. 303, 489-502]. As 
PCR template the library clone containing 5'-TGG-3' finger 2 and 5'-GAT-3' finger 3 was 
used. PCR products containing a mutagenized finger 2 and 5'-GAT-3' finger 3 were 
subcloned via NsiI and SpeI restriction sites in frame with finger 1 of C7 into a modified 
pMal-c2 vector (New England Biolabs). Three-finger proteins were constructed by finger-2 
stitchery using the SP1C framework as described [Beerli et al., (1998) Proc Natl Acad Sci 
USA 95(25), 14628-14633]. The proteins generated in this work contained helices recognizing 
5'-GNN-3' DNA sequences [Segal et al., (1999) Proc Natl Acad Sci USA 96(6), 
2758-2763], as well as 5'-ANN-3' and 5'-TAG-3' helices described here. Six-finger proteins 
were assembled via compatible XmaI and BsrFI restriction sites. Analysis of DNA-binding 
properties was performed from IPTG-induced freeze/thaw bacterial extracts. 

Example 4: General Methods. 

Transfection and luciferase assays 

HeLa cells were used at a confluency of 40-60%. Cells were transfected with 160 ng 
reporter plasmid (pGL3-promoter constructs) and 40 ng of effector plasmid (zinc 
finger-effector domain fusions in pcDNA3) in 24 well plates. Cell extracts were prepared 48 hrs 
after transfection and measured with luciferase assay reagent (Promega) in a MicroLumat 
LB96P luminometer (EG&G Berthold, Gaithersburg, MD). 

Retroviral gene targeting and Flow cytometric analysis 

These assays were performed as described [Beerli et al., (2000) Proc Natl Acad Sci 
USA 97(4), 1495-1500; Beerli et al., (2000) J. Biol. Chem. 275(42), 32617-32627]. As 
primary antibody an ErbB-1-specific mAb EGFR (Santa Cruz), ErbB-2-specific mAb FSP77 
(gift from Nancy E. Hynes; Harwerth et al., 1992) and an ErbB-3-specific mAb SGP1 
(Oncogene Research Products) were used. Fluorescently labeled donkey F(ab')2 anti-mouse 
IgG was used as secondary antibody (Jackson Immuno-Research). 

Example 5: Bacterial extracts of pMal-fusion proteins for ELISA assays. 

The selected zinc finger proteins were cloned into the pMal vector (New England 
Biolabs) for expression. The constructs were transferred into the E. coli strain XL1-Blue by 
electroporation and streaked on LB plates containing 50 µg/ml carbenicillin. Four single 
colonies of each mutant were inoculated into 3 ml of SB media containing 50 µg/ml 
carbenicillin and 1% glucose. Cultures were grown overnight at 37°C. 1.2 ml of the cultures 
were transferred into 20 ml of fresh SB media containing 50 µg/ml carbenicillin, 0.2% 
glucose, 90 µg/ml ZnCl2 and grown at 37°C for another 2 hours. IPTG was added to a final 
concentration of 0.3 mM. Incubation was continued for 2 hours. The cultures were 
centrifuged at 4°C for 5 minutes at 3500 rpm in a Beckman GPR centrifuge. Bacterial pellets 
were resuspended in 1.2 ml of Zinc Buffer A containing 5 mM fresh DTT. Protein extracts 
were isolated by freeze/thaw procedure using dry ice/ethanol and warm water. This 
procedure was repeated 6 times. Samples were centrifuged at 4°C for 5 minutes in an 
Eppendorf centrifuge. The supernatant was transferred to a clean 1.5 ml centrifuge tube and 
used for the ELISA assays. 

ELISA assays - Finger-2 variants of C7.GAT were subcloned into bacterial expression 
vector as fusion with maltose-binding protein (MBP) and proteins were expressed by 
induction with 1 mM IPTG (proteins (p) are given the name of the finger-2 subsite against 
which they were selected). Proteins were tested by enzyme-linked immunosorbent assay 
(ELISA) against each of the 16 finger-2 subsites of the type 5'-GAT CNN GCG-3' (SEQ ID 
NO:34) to investigate their DNA-binding specificity. 

In addition, the 5'-nucleotide recognition was analyzed by exposing zinc finger 
proteins to the specific target oligonucleotide and three subsites which differed only in the 
5'-nucleotide of the middle triplet. For example, pCAA was tested on 5'-AAA-3', 5'-CAA-3', 
5'-GAA-3', and 5'-TAA-3' subsites. Many of the tested 3-finger proteins showed exquisite 
DNA-binding specificity for the finger-2 subsite against which they were selected (see 
Table 1, below). 

TARGET   ZINC FINGER HEPTAMER

CAA      SEQ ID NO:1    QRHNLTE
         SEQ ID NO:2    QSGNLTE
CAC      SEQ ID NO:3    NLQHLGE
CAG      SEQ ID NO:4    RADNLTE
         SEQ ID NO:5    RADNLAI
         SEQ ID NO:14   RSDHLTE
         SEQ ID NO:16   RSDHLTD
         SEQ ID NO:8    RNDTLTE
CAT      SEQ ID NO:1    QRHNLTE
         SEQ ID NO:6    NTTHLEH
         SEQ ID NO:24   TKQTLTE
         SEQ ID NO:3    NLQHLGE
CCA      SEQ ID NO:6    NTTHLEH
         SEQ ID NO:25   QSGDLTE
CCC      SEQ ID NO:7    SKKHLAE
CCG      SEQ ID NO:8    RNDTLTE
         SEQ ID NO:9    RNDTLQA
CCT      SEQ ID NO:6    NTTHLEH
CGA      SEQ ID NO:10   QSGHLTE
         SEQ ID NO:11   QLAHLKE
         SEQ ID NO:12   QRAHLTE
         SEQ ID NO:17   RSDHLTN
CGC      SEQ ID NO:13   HTGHLLE
CGG      SEQ ID NO:14   RSDHLTE
         SEQ ID NO:15   RSDKLTE
         SEQ ID NO:16   RSDHLTD
         SEQ ID NO:17   RSDHLTN
         SEQ ID NO:8    RNDTLTE
CGT      SEQ ID NO:18   SRRTCRA
         SEQ ID NO:19   QLRHLRE
         SEQ ID NO:7    SKKHLAE
CTA      SEQ ID NO:20   QRHSLIE
CTC      SEQ ID NO:21   QLAHLKE
         SEQ ID NO:22   NLQHLGE
CTG      SEQ ID NO:23   RNDALTE
         SEQ ID NO:5    RADNLAI
         SEQ ID NO:8    RNDTLTE
         SEQ ID NO:14   RSDHLTE
         SEQ ID NO:9    RNDTLQA
CTT      SEQ ID NO:6    NTTHLEH
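
As an illustration of how the entries of Table 1 might be used, the following Python sketch maps each triplet of a CNN-only target to one heptamer from the table and lists the helices in finger order. It assumes, as in the 5'-GAT CNN GCG-3' targets above where finger 3 reads GAT and finger 2 reads CNN, that finger 1 reads the 3'-most triplet. One helix per target has been chosen arbitrarily from the table; the dictionary and function names are hypothetical, and the sketch says nothing about whether a given combination binds as intended.

```python
# One heptamer per 5'-CNN-3' target, picked from Table 1 for illustration only.
HELIX_FOR_TRIPLET = {
    "CAA": "QRHNLTE", "CAC": "NLQHLGE", "CAG": "RADNLTE", "CAT": "QRHNLTE",
    "CCA": "NTTHLEH", "CCC": "SKKHLAE", "CCG": "RNDTLTE", "CCT": "NTTHLEH",
    "CGA": "QSGHLTE", "CGC": "HTGHLLE", "CGG": "RSDKLTE", "CGT": "SRRTCRA",
    "CTA": "QRHSLIE", "CTC": "QLAHLKE", "CTG": "RNDALTE", "CTT": "NTTHLEH",
}


def helices_for_target(site):
    """Return (finger number, triplet, heptamer) tuples for a 5'->3' target.

    Assumption: finger 1 reads the 3'-most triplet, so the triplets are
    walked in reverse order when numbering the fingers.
    """
    site = site.upper()
    if len(site) % 3 != 0:
        raise ValueError("target length must be a multiple of 3")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    missing = [t for t in triplets if t not in HELIX_FOR_TRIPLET]
    if missing:
        raise KeyError("no CNN helix listed for: " + ", ".join(missing))
    return [(number, triplet, HELIX_FOR_TRIPLET[triplet])
            for number, triplet in enumerate(reversed(triplets), start=1)]


if __name__ == "__main__":
    # Hypothetical 9-bp (CNN)3 target.
    for finger, triplet, helix in helices_for_target("CAGCCCCTA"):
        print(finger, triplet, helix)
```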
Example 6: Gel mobility shift assays. 

Zinc finger polypeptides linked to transcription regulating factors are purified to 
>90% homogeneity using the Protein Fusion and Purification System (New England Biolabs), 
except that ZBA/5 mM DTT is used as the column buffer. Protein purity and concentration 
are determined from Coomassie blue-stained 15% SDS-PAGE gels by comparison to BSA 
standards. Target oligonucleotides are labeled at their 5' or 3' ends with 32P and gel purified. 
Eleven 3-fold serial dilutions of protein are incubated in 20 µl binding reactions (1x Binding 
Buffer/10% glycerol/approximately 1 pM target oligonucleotide) for three hours at room 
temperature, then resolved on a 5% polyacrylamide gel in 0.5x TBE buffer. Quantitation of 
dried gels is performed using a PhosphorImager and ImageQuant software (Molecular 
Dynamics), and the Kd is determined by Scatchard analysis. 
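
The dilution series and the Scatchard estimate of Kd can be sketched as follows. This is a simplified illustration, not the analysis actually performed: it assumes that the labeled target (about 1 pM) is far below the protein concentration, so free protein is approximated by total protein, and it uses synthetic fraction-bound values generated from a hypothetical Kd of 2 nM and a hypothetical starting concentration of 100 nM. In Scatchard form, theta/[P] = 1/Kd - theta/Kd, so a line through (theta, theta/[P]) has slope -1/Kd.

```python
def serial_dilutions(start, fold=3.0, steps=11):
    """Concentrations of a serial dilution, highest first."""
    return [start / fold ** i for i in range(steps)]


def scatchard_kd(protein_conc, fraction_bound):
    """Estimate Kd from a least-squares line through (theta, theta/[P]).

    Assumption: free protein ~ total protein because the labeled target
    is present at a much lower concentration than the protein.
    """
    if len(protein_conc) != len(fraction_bound) or len(protein_conc) < 2:
        raise ValueError("need two or more matched data points")
    xs = list(fraction_bound)
    ys = [theta / conc for theta, conc in zip(fraction_bound, protein_conc)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx          # equals -1/Kd for ideal 1:1 binding
    return -1.0 / slope


if __name__ == "__main__":
    true_kd_nm = 2.0                         # hypothetical value
    concs_nm = serial_dilutions(100.0)       # hypothetical top concentration
    theta = [c / (true_kd_nm + c) for c in concs_nm]   # synthetic data
    print(round(scatchard_kd(concs_nm, theta), 6))
```

With real gel data the points scatter around the line, and the lowest and highest dilutions contribute little information, so the fit quality should be inspected rather than assumed.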

Example 7: Construction of zinc finger-effector domain fusion proteins. 

For the construction of zinc finger-effector domain fusion proteins, DNAs encoding 
amino acids 473 to 530 of the ets repressor factor (ERF) repressor domain (ERD) (Sgouras, 
D. N., Athanasiou, M. A., Beal, G. J., Jr., Fisher, R. J., Blair, D. G. & Mavrothalassitis, G. J. 
(1995) EMBO J. 14, 4781-4793), amino acids 1 to 97 of the KRAB domain of KOX1 
(Margolin, J. F., Friedman, J. R., Meyer, W. K.-H., Vissing, H., Thiesen, H.-J. & Rauscher 
III, F. J. (1994) Proc. Natl. Acad. Sci. USA 91, 4509-4513), or amino acids 1 to 36 of the Mad 
mSIN3 interaction domain (SID) (Ayer, D. E., Laherty, C. D., Lawrence, Q. A., Armstrong, 
A. P. & Eisenman, R. N. (1996) Mol. Cell. Biol. 16, 5772-5781) are assembled from 
overlapping oligonucleotides using Taq DNA polymerase. The coding region for amino acids 
413 to 489 of the VP16 transcriptional activation domain (Sadowski, I., Ma, J., Triezenberg, 
S. & Ptashne, M. (1988) Nature 335, 563-564) is PCR amplified from pcDNA3/C7-C7-VP16 
(10). The VP64 DNA, encoding a tetrameric repeat of VP16's minimal activation domain, 
comprising amino acids 437 to 447 (Seipel, K., Georgiev, O. & Schaffner, W. (1992) EMBO 
J. 11, 4961-4968), is generated from two pairs of complementary oligonucleotides. The 
resulting fragments are fused to zinc finger coding regions by standard cloning procedures, 
such that each resulting construct contains an internal SV40 nuclear localization signal, as 
well as a C-terminal HA decapeptide tag. Fusion constructs are cloned in the eucaryotic 
expression vector pcDNA3 (Invitrogen). 



WO 03/016496 
PCT/US02/26388 

Example 8: Construction of luciferase reporter plasmids. 

An erbB-2 promoter fragment comprising nucleotides -758 to -1, relative to the ATG 
initiation codon, is PCR amplified from human bone marrow genomic DNA with the 
TaqExpand DNA polymerase mix (Boehringer Mannheim) and cloned into pGL3basic 
(Promega), upstream of the firefly luciferase gene. A human erbB-2 promoter fragment 
encompassing nucleotides -1571 to -24 is excised from pSVOALΔ5'/erbB-2(N-N) (Hudson, 
L. G., Ertl, A. P. & Gill, G. N. (1990) J. Biol. Chem. 265, 4389-4393) by HindIII digestion and 
subcloned into pGL3basic, upstream of the firefly luciferase gene. 

Example 9: Luciferase assays. 

For all transfections, HeLa cells are used at a confluency of 40-60%. Typically, cells 
are transfected with 400 ng reporter plasmid (pGL3-promoter constructs or, as negative 
control, pGL3basic), 50 ng effector plasmid (zinc finger constructs in pcDNA3 or, as negative 
control, empty pcDNA3), and 200 ng internal standard plasmid (phrAct-βGal) in a well of a 6 
well dish using the Lipofectamine reagent (Gibco BRL). Cell extracts are prepared 
approximately 48 hours after transfection. Luciferase activity is measured with luciferase 
assay reagent (Promega), βGal activity with Galacto-Light (Tropix), in a MicroLumat LB96P 
luminometer (EG&G Berthold). Luciferase activity is normalized on βGal activity. 
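
The normalization step can be written out as a small calculation. The sketch below is illustrative only: the function names are hypothetical and the luminometer readings are made-up numbers, not data from these experiments. Each luciferase reading is divided by the βGal reading from the same well, and the effector sample is then expressed relative to the empty-vector control.

```python
def normalized_activity(luciferase, bgal):
    """Luciferase signal divided by the internal-standard (bGal) signal."""
    if bgal <= 0:
        raise ValueError("bGal reading must be positive")
    return luciferase / bgal


def fold_change(sample, control):
    """Normalized activity of a sample relative to a control.

    Both arguments are (luciferase, bGal) pairs from the same assay.
    """
    return normalized_activity(*sample) / normalized_activity(*control)


if __name__ == "__main__":
    # Hypothetical readings in relative light units.
    empty_vector = (12000.0, 400.0)   # reporter + empty pcDNA3
    repressor = (1500.0, 500.0)       # reporter + repressor construct
    activator = (90000.0, 300.0)      # reporter + activator construct
    print(fold_change(repressor, empty_vector))
    print(fold_change(activator, empty_vector))
```

Dividing by the internal standard corrects for well-to-well differences in transfection efficiency and cell number, which is why the ratio, not the raw luciferase signal, is compared between effector and control.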

Example 10: Regulation of the erbB-2 gene in HeLa cells. 

The erbB-2 gene is targeted for imposed regulation. To regulate the native erbB-2 
gene, a synthetic repressor protein and a transactivator protein are utilized (R. R. Beerli, D. J. 
Segal, B. Dreier, C. F. Barbas III, Proc. Natl. Acad. Sci. USA 95, 14628 (1998)). This 
DNA-binding protein is constructed from 6 pre-defined and modular zinc finger domains (D. J. 
Segal, B. Dreier, R. R. Beerli, C. F. Barbas III, Proc. Natl. Acad. Sci. USA 96, 2758 (1999)). 
The repressor protein contains the Kox-1 KRAB domain (J. F. Margolin et al., Proc. Natl. 
Acad. Sci. USA 91, 4509 (1994)), whereas the transactivator VP64 contains a tetrameric 
repeat of the minimal activation domain (K. Seipel, O. Georgiev, W. Schaffner, EMBO J. 11, 
4961 (1992)) derived from the herpes simplex virus protein VP16. 

A derivative of the human cervical carcinoma cell line HeLa, HeLa/tet-off, is utilized 
(M. Gossen and H. Bujard, Proc. Natl. Acad. Sci. USA 89, 5547 (1992)). Since HeLa cells 
are of epithelial origin they express ErbB-2 and are well suited for studies of erbB-2 gene 
targeting. HeLa/tet-off cells produce the tetracycline-controlled transactivator, allowing 
induction of a gene of interest under the control of a tetracycline response element (TRE) by 
removal of tetracycline or its derivative doxycycline (Dox) from the growth medium. We use 
this system to place our transcription factors under chemical control. Thus, repressor and 
activator plasmids are constructed and subcloned into pRevTRE (Clontech) using BamHI 
and ClaI restriction sites, and into pMX-IRES-GFP [X. Liu et al., Proc. Natl. Acad. Sci. USA 
94, 10669 (1997)] using BamHI and NotI restriction sites (fidelity of the PCR amplification 
is confirmed by sequencing), transfected into HeLa/tet-off cells, and 20 stable clones each 
are isolated and analyzed for Dox-dependent target gene regulation. The constructs are 
transfected into the HeLa/tet-off cell line (M. Gossen and H. Bujard, Proc. Natl. Acad. Sci. 
USA 89, 5547 (1992)) using Lipofectamine Plus reagent (Gibco BRL). After two weeks of 
selection in hygromycin-containing medium, in the presence of 2 µg/ml Dox, stable clones 
are isolated and analyzed for Dox-dependent regulation of ErbB-2 expression. Western blots, 
immunoprecipitations, Northern blots, and flow cytometric analyses are carried out 
essentially as described [D. Graus-Porta, R. R. Beerli, N. E. Hynes, Mol. Cell. Biol. 15, 1182 
(1995)]. As a read-out of erbB-2 promoter activity, ErbB-2 protein levels are initially 
analyzed by Western blotting. A significant fraction of these clones will show regulation of 
ErbB-2 expression upon removal of Dox for 4 days, i.e., downregulation of ErbB-2 in 
repressor clones and upregulation in activator clones. ErbB-2 protein levels are correlated 
with altered levels of their specific mRNA, indicating that regulation of ErbB-2 expression is 
a result of repression or activation of transcription. 

Example 11: Introduction of the coding regions of the E2S-KRAB, E2S-VP64, E3F-KRAB 
and E3F-VP64 proteins into the retroviral vector pMX-IRES-GFP. 

In order to express the E2S-KRAB, E2S-VP64, E3F-KRAB and E3F-VP64 proteins 
(See Table 2, below) in several cell lines, their coding regions were introduced into the 
retroviral vector pMX-IRES-GFP. 

The sequences of these constructs were selected to bind to specific regions of the 
ErbB-2 or ErbB-3 promoters (See Table 2). The coding regions were PCR amplified from 
pcDNA3-based expression plasmids (R. R. Beerli, D. J. Segal, B. Dreier, C. F. Barbas III, 
Proc. Natl. Acad. Sci. USA 95, 14628 (1998)) and subcloned into pRevTRE (Clontech) using 
BamHI and ClaI restriction sites, and into pMX-IRES-GFP [X. Liu et al., Proc. Natl. Acad. 
Sci. USA 94, 10669 (1997)] using BamHI and NotI restriction sites. Fidelity of the PCR 
amplification was confirmed by sequencing. This vector expresses a single bicistronic 
message for the translation of the zinc finger protein and, from an internal ribosome-entry site 
(IRES), the green fluorescent protein (GFP). Since both coding regions share the same 
mRNA, their expression is physically linked to one another and GFP expression is an 
indicator of zinc finger expression. Virus prepared from these plasmids was then used to 
infect the human carcinoma cell line A431. 

Example 12: Regulation of ErbB-2 and ErbB-3 Gene Expression. 

Plasmids from Example 11 were transiently transfected into the amphotropic 
packaging cell line Phoenix Ampho using Lipofectamine Plus (Gibco BRL) and, two days 
later, culture supernatants were used for infection of target cells in the presence of 8 µg/ml 
polybrene. Three days after infection, cells were harvested for analysis and ErbB-2 and 
ErbB-3 expression was measured by flow cytometry. The results show 
that E2S-KRAB and E2S-VP64 compositions inhibited and enhanced ErbB-2 gene 
expression, respectively. The data also show that E3F-KRAB and E3F-VP64 compositions 
inhibited and enhanced ErbB-3 gene expression, respectively. 

The human erbB-2 and erbB-3 genes were chosen as model targets for the 
development of zinc finger-based transcriptional switches. Members of the ErbB receptor 
family play important roles in the development of human malignancies. In particular, erbB-2 
is overexpressed as a result of gene amplification and/or transcriptional deregulation in a high 
percentage of human adenocarcinomas arising at numerous sites, including breast, ovary, 
lung, stomach, and salivary gland (Hynes, N. E. & Stern, D. F. (1994) Biochim. Biophys. Acta 
1198, 165-184). Increased expression of ErbB-2 leads to constitutive activation of its 
intrinsic tyrosine kinase, and has been shown to cause the transformation of cultured cells. 
Numerous clinical studies have shown that patients bearing tumors with elevated ErbB-2 
expression levels have a poorer prognosis (Hynes, N. E. & Stern, D. F. (1994) Biochim. 
Biophys. Acta 1198, 165-184). In addition to its involvement in human cancer, erbB-2 plays 
important biological roles, both in the adult and during embryonal development of mammals 
(Hynes, N. E. & Stern, D. F. (1994) Biochim. Biophys. Acta 1198, 165-184; Altiok, N., 
Bessereau, J.-L. & Changeux, J.-P. (1995) EMBO J. 14, 4258-4266; Lee, K.-F., Simon, H., 
Chen, H., Bates, B., Hung, M.-C. & Hauser, C. (1995) Nature 378, 394-398). 

The erbB-2 promoter therefore represents an interesting test case for the development 
of artificial transcriptional regulators. This promoter has been characterized in detail and has 
been shown to be relatively complex, containing both a TATA-dependent and a 
TATA-independent transcriptional initiation site (Ishii, S., Imamoto, F., Yamanashi, Y., Toyoshima, 
K. & Yamamoto, T. (1987) Proc. Natl. Acad. Sci. USA 84, 4374-4378). Whereas early 
studies showed that polydactyl proteins could act as transcriptional regulators that specifically 
activate or repress transcription, these proteins bound upstream of an artificial promoter to six 
tandem repeats of the protein's binding site (Liu, Q., Segal, D. J., Ghiara, J. B. & Barbas III, 
C. F. (1997) Proc. Natl. Acad. Sci. USA 94, 5525-5530). Furthermore, this study utilized 
polydactyl proteins that were not modified in their binding specificity. Herein, we tested the 
efficacy of polydactyl proteins assembled from predefined building blocks to bind a single 
site in the native erbB-2 and erbB-3 promoter. 

For generating polydactyl proteins with desired DNA-binding specificity, the present 
studies have focused on the assembly of predefined zinc finger domains, which contrasts with 
the sequential selection strategy proposed by Greisman and Pabo (Greisman, H. A. & Pabo, C. O. 
(1997) Science 275, 657-661). Such a strategy would require the sequential generation and 
selection of six zinc finger libraries for each required protein, making this experimental 
approach inaccessible to most laboratories and extremely time-consuming to all. Further, 
since it is difficult to apply specific negative selection against binding alternative sequences 
in this strategy, proteins may result that are relatively unspecific as was recently reported 
(Kim, J.-S. & Pabo, C. O. (1997) J. Biol. Chem. 272, 29795-29800). 

The general utility of two different strategies for generating three-finger proteins 
recognizing 18 bp of DNA sequence was investigated. Each strategy was based on the 
modular nature of the zinc finger domain, and took advantage of a family of zinc finger 
domains recognizing triplets of the type 5'-NNN-3'. Three six-finger proteins recognizing 
half-sites of erbB-2 or erbB-3 target sites were generated in the first strategy by fusing the 
pre-defined finger 2 (F2) domain variants together using a PCR assembly strategy. 

The affinity of each of the proteins for its target was determined by electrophoretic 
mobility-shift assays. These studies demonstrated that the zinc finger peptides have affinities 
comparable to Zif268 and other natural transcription factors. 

The affinity of each protein for the DNA target site is determined by gel-shift analysis. 

WHAT IS CLAIMED IS: 

1. An isolated and purified zinc finger nucleotide binding polypeptide comprising a 
nucleotide binding region of from 5 to 10 amino acid residues, which region binds 
preferentially to a target nucleotide of the formula CNN, where N is A, C, G or T. 

2. The polypeptide of claim 1 wherein the target nucleotide has the formula CAN, CCN, 
CGN, CTN, CNA, CNC, CNG or CNT. 

3. The polypeptide of claim 1 wherein the target nucleotide has the formula CAA, CAC, 
CAG, CAT, CCA, CCC, CCG, CCT, CGA, CGC, CGG, CGT, CTA, CTC, CTG or 
CTT. 

4. The polypeptide of claim 1 wherein the binding region has an amino acid residue 
sequence with the same nucleotide binding characteristics as any of SEQ ID NOs:1-25. 

5. The polypeptide of claim 1 that competes for binding to a nucleotide target with any 
of SEQ ID NOs:1-25. 

6. The polypeptide of claim 1 wherein the binding region has the amino acid residue 
sequence of any of SEQ ID NOs:1-25. 

7. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:1. 

8. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:2. 

9. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:3. 

10. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:4. 

11. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:5. 

12. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:6. 

13. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:7. 

14. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:8. 

15. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:9. 

16. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 10. 

17. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 11. 

18. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 12. 

19. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:13. 



20. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 14. 




21. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 15. 

22. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 16. 

23. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 17. 

24. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 18. 

25. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO: 19. 

26. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:20. 

27. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:21. 

28. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:22. 

29. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:23. 

30. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:24. 

31. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of SEQ ID NO:25. 

32. An isolated and purified zinc finger nucleotide binding polypeptide consisting of an 
amino acid residue sequence of any of SEQ ID NOs: 1-25. 

33. A peptide composition comprising a plurality of the polypeptide of claim 1, wherein 
the polypeptides are operatively linked to each other. 

34. The peptide composition of claim 33 wherein operatively linked is linked via a 
flexible peptide linker of from 5 to 15 amino acid residues. 

35. The peptide composition of claim 34 wherein the flexible peptide linker has the amino 
acid residue sequence of SEQ ID NO:30. 

36. The peptide composition of claim 33 wherein a plurality is from 2 to 12. 

37. The peptide composition of claim 33 wherein a plurality is from 2 to 6. 

38. The peptide composition of claim 36 that binds to a nucleotide sequence that
comprises a sequence of the formula 5'-(CNN)n-3', where N is A, C, G or T and n is
2 to 12.

39. The peptide composition of claim 38 wherein the sequence 5'-(CNN)n-3' is located
within a sequence of the formula 5'-(NNN)2.i3-3'.

40. The peptide composition of claim 38 that binds to a nucleotide sequence with a KD of
from 1 µM to 10 µM.

41. The peptide composition of claim 38 that binds to a nucleotide sequence with a KD of
from 10 fM to 1 µM.

42. The peptide composition of claim 38 that binds to a nucleotide sequence with a KD of
from 10 pM to 100 nM.

43. The peptide composition of claim 38 that binds to a nucleotide sequence with a KD of
from 100 pM to 10 nM.

44. The peptide composition of claim 38 that binds to a nucleotide sequence with a KD of
from 1 nM to 10 nM.

45. The polypeptide of claim 1 operatively linked to one or more transcription regulating 
factors. 

46. The polypeptide of claim 45 wherein the transcription regulating factor is a repressor 
of transcription. 

47. The polypeptide of claim 45 wherein the transcription regulating factor is an activator 
of transcription. 

48. The peptide composition of claim 33 operatively linked to one or more transcription
regulating factors.

49. The composition of claim 48 wherein the transcription regulating factor is an activator 
of transcription. 

50. The composition of claim 48 wherein the transcription regulating factor is a repressor 
of transcription. 

51. An isolated and purified polynucleotide that encodes the polypeptide of claim 1. 

52. An isolated and purified polynucleotide that encodes the peptide composition of claim 
33. 



53. An expression vector that contains the polynucleotide of claim 51.

54. An expression vector that contains the polynucleotide of claim 52.

55. A host cell transformed with the polynucleotide of claim 51.

56. A host cell transformed with the polynucleotide of claim 52.

57. A host cell transformed with the expression vector of claim 53. 

58. A host cell transformed with the expression vector of claim 54. 

59. A process of regulating expression of a nucleotide sequence that contains the
sequence 5'-(CNN)n-3', where n is 2 to 12, the process comprising exposing the
nucleotide sequence to an effective amount of the composition of claim 33.

60. The process of claim 59 wherein the sequence 5'-(CNN)n-3' is located
within a 5'-(TNN)-3' sequence.

61. The process of claim 59 wherein the sequence 5'-(CNN)n-3' is located in the
transcribed region of the nucleotide sequence.

62. The process of claim 59 wherein the sequence 5'-(CNN)n-3' is located in a promoter
region of the nucleotide sequence.

63. The process of claim 59 wherein the sequence 5'-(CNN)n-3' is located within an
expressed sequence tag.

64. The process of claim 59 wherein the composition is operatively linked to one or more 
transcription regulating factors. 

65. The process of claim 64 wherein the transcription regulating factor is a repressor of
transcription.

66. The process of claim 64 wherein the transcription regulating factor is an activator of
transcription.

67. The process of claim 59 wherein the nucleotide sequence is a gene. 

68. The process of claim 67 wherein the gene is a eukaryotic gene. 

69. The process of claim 59 wherein the gene is a prokaryotic gene. 

70. The process of claim 59 wherein the gene is a viral gene. 

71. The process of claim 68 wherein the eukaryotic gene is a mammalian gene.

72. The process of claim 71 wherein the mammalian gene is a human gene.

73. The process of claim 68 wherein the eukaryotic gene is a plant gene. 

74. The process of claim 69 wherein the prokaryotic gene is a bacterial gene.